Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 13 days ago • 87
UltraData Collection Ultra Scale, Ultra Quality, Ultra Coverage • 10 items • Updated 10 days ago • 81
InfLLM-V2: Dense-Sparse Switchable Attention for Seamless Short-to-Long Adaptation Paper • 2509.24663 • Published Sep 29, 2025 • 16