From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company Paper • 2604.22446 • Published Apr 24 • 121
Article DeepSeek-V4: a million-token context that agents can actually use burtenshaw • Apr 24 • 47
DFlash Collection Block Diffusion for Flash Speculative Decoding • 21 items • Updated 20 days ago • 122
Article AI and the Future of Cybersecurity: Why Openness Matters meg, yjernite, clem • Apr 21 • 38
Nemotron-Personas Collection A collection of multilingual, region-specific synthetic persona datasets that support sovereign AI development across many countries and regions. • 7 items • Updated 1 day ago • 45
PGC Psychiatric GWAS Summary Statistics Collection ~1 billion rows of genome-wide association study (GWAS) NOTE: We are in the process of transferring these datasets to the Psychiatric Genomics Consortiu • 12 items • Updated Apr 14 • 91
Article Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face dvgodoy • Feb 11, 2025 • 123
Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 158
Article Nemotron-Personas: Improve AI Training With the First Synthetic Personas Dataset Aligned to Real-World Distributions nvidia • Jun 10, 2025 • 25