Papers Explored

Papers explored, listed in reverse chronological order of publication. Generated by jekyll-scholar.

2025

  1. International AI Safety Report
    Yoshua Bengio, Sören Mindermann, Daniel Privitera, and 8 more authors
    arXiv preprint arXiv:2501.17805, 2025
  2. On the Biology of a Large Language Model
    Jack Lindsey, Wes Gurnee, Emmanuel Ameisen, and 24 more authors
    Transformer Circuits Thread, 2025
  3. DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning
    Daya Guo, Dejian Yang, Haowei Zhang, and 8 more authors
    arXiv preprint arXiv:2501.12948, 2025

2024

  1. Neural Network Compression: The Functional Perspective
    Israel Mason-Williams
    In 5th Workshop on Practical ML for Limited/Low Resource Settings, 2024
  2. Knowledge Distillation: The Functional Perspective
    Israel Mason-Williams, Gabryel Mason-Williams, and Mark Sandler
    In NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning, 2024
  3. DeepSeek-V2: A strong, economical, and efficient mixture-of-experts language model
    Aixin Liu, Bei Feng, Bin Wang, and 8 more authors
    arXiv preprint arXiv:2405.04434, 2024
  4. DeepSeek-V3 technical report
    Aixin Liu, Bei Feng, Bing Xue, and 8 more authors
    arXiv preprint arXiv:2412.19437, 2024
  5. Better & faster large language models via multi-token prediction
    Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, and 2 more authors
    arXiv preprint arXiv:2404.19737, 2024
  6. DeepSeekMath: Pushing the limits of mathematical reasoning in open language models
    Zhihong Shao, Peiyi Wang, Qihao Zhu, and 8 more authors
    arXiv preprint arXiv:2402.03300, 2024

2023

  1. Amortizing intractable inference in large language models
    Edward J Hu, Moksh Jain, Eric Elmoznino, and 4 more authors
    arXiv preprint arXiv:2310.04363, 2023
  2. GFlowNet foundations
    Yoshua Bengio, Salem Lahlou, Tristan Deleu, and 3 more authors
    Journal of Machine Learning Research, 2023
  3. Learning GFlowNets from partial episodes for improved convergence and stability
    Kanika Madan, Jarrid Rector-Brooks, Maksym Korablyov, and 6 more authors
    In International Conference on Machine Learning, 2023

2022

  1. Trajectory balance: Improved credit assignment in GFlowNets
    Nikolay Malkin, Moksh Jain, Emmanuel Bengio, and 2 more authors
    Advances in Neural Information Processing Systems, 2022