Focusing on What Matters: Saliency-Harnessing Accurate Routing for Diffusion MoE Paper • 2606.26938 • Published 9 days ago • 5
OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization Paper • 2605.17757 • Published May 18 • 66
WorldKV: Efficient World Memory with World Retrieval and Compression Paper • 2605.22718 • Published May 21 • 42
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published May 12 • 196
Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis Paper • 2605.18451 • Published May 18 • 41
Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published May 13 • 60
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 244
Synthetic Sandbox for Training Machine Learning Engineering Agents Paper • 2604.04872 • Published Apr 6 • 14
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 329
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping Paper • 2604.08364 • Published Apr 9 • 103
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 638
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 344
From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models Paper • 2602.22859 • Published Feb 26 • 150