Papers Explored | Kings AI Reading Group

2025

International AI Safety Report

Yoshua Bengio, Sören Mindermann, Daniel Privitera, and 8 more authors

arXiv preprint arXiv:2501.17805, 2025
On the Biology of a Large Language Model

Jack Lindsey, Wes Gurnee, Emmanuel Ameisen, and 24 more authors

Transformer Circuits Thread, 2025
DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning

Daya Guo, Dejian Yang, Haowei Zhang, and 8 more authors

arXiv preprint arXiv:2501.12948, 2025

2024

Neural Network Compression: The Functional Perspective

Israel Mason-Williams

In 5th Workshop on practical ML for limited/low resource settings, 2024

Knowledge Distillation: The Functional Perspective

Israel Mason-Williams, Gabryel Mason-Williams, and Mark Sandler

In NeurIPS 2024 Workshop on Scientific Methods for Understanding Deep Learning, 2024

DeepSeek-V2: A strong, economical, and efficient mixture-of-experts language model

Aixin Liu, Bei Feng, Bin Wang, and 8 more authors

arXiv preprint arXiv:2405.04434, 2024
DeepSeek-V3 technical report

Aixin Liu, Bei Feng, Bing Xue, and 8 more authors

arXiv preprint arXiv:2412.19437, 2024
Better & faster large language models via multi-token prediction

Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, and 2 more authors

arXiv preprint arXiv:2404.19737, 2024
DeepSeekMath: Pushing the limits of mathematical reasoning in open language models

Zhihong Shao, Peiyi Wang, Qihao Zhu, and 8 more authors

arXiv preprint arXiv:2402.03300, 2024

2023

Amortizing intractable inference in large language models

Edward J Hu, Moksh Jain, Eric Elmoznino, and 4 more authors

arXiv preprint arXiv:2310.04363, 2023
GFlowNet foundations

Yoshua Bengio, Salem Lahlou, Tristan Deleu, and 3 more authors

Journal of Machine Learning Research, 2023
Learning GFlowNets from partial episodes for improved convergence and stability

Kanika Madan, Jarrid Rector-Brooks, Maksym Korablyov, and 6 more authors

In International Conference on Machine Learning, 2023

2022

Trajectory balance: Improved credit assignment in GFlowNets

Nikolay Malkin, Moksh Jain, Emmanuel Bengio, and 2 more authors

Advances in Neural Information Processing Systems, 2022