LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 244
HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing Paper • 2603.07236 • Published Mar 7 • 3
Nested Learning: The Illusion of Deep Learning Architectures Paper • 2512.24695 • Published Dec 31, 2025 • 46
Diversity or Precision? A Deep Dive into Next Token Prediction Paper • 2512.22955 • Published Dec 28, 2025 • 10
One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient Paper • 2509.26313 • Published Sep 30, 2025 • 5
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks Paper • 2401.02731 • Published Jan 5, 2024 • 3
GroveMoE Collection GroveMoE is an open-source family of large language models developed by the AGI Center, Ant Research Institute. • 3 items • Updated 12 days ago • 9
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts Paper • 2508.07785 • Published Aug 11, 2025 • 30
Cosmos-Predict1 Collection ⚠️ This collection is archived. See https://huggingface.co/collections/nvidia/cosmos3 • 14 items • Updated 15 days ago • 304
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated Mar 2 • 100