Running 191 The ultimate guide to RL environments: building and scaling them in the LLM era 📝 191 Building and scaling RL environments for LLM training
view article Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 164
Running 114 Unlocking On-Policy Distillation for Any Model Family 📝 114 Explore on-policy distillation visualization for any model
Running Featured 88 Distilling 100B+ Models 40x Faster with TRL 📝 88 TRL distillation for 100B+ teachers, 40x faster
view article Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 909
Running on CPU Upgrade Agents 1.02k Open VLM Leaderboard 🌎 1.02k VLMEvalKit Evaluation Results Collection
view article Article Introducing smolagents: simple agents that write actions in code. +1 m-ric, merve, thomwolf • Dec 31, 2024 • 1.2k
view article Article Vision Language Models (Better, faster, stronger) +3 merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 614
view article Article 视觉语言模型 (更好、更快、更强) +3 merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 17
Running on CPU Upgrade 261 The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 261 Visualize synthetic‑data experiments as an interactive bookshelf
Running Featured 74 QED-Nano: Teaching a Tiny Model to Prove Hard Theorems 📝 74 Who needs 1T parameters? Olympiad proofs with a 4B model
view article Article Mixture of Experts (MoEs) in Transformers +5 ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 169
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311