From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning Paper • 2606.17682 • Published 11 days ago • 26
Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories Paper • 2606.11176 • Published 18 days ago • 127
Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning Paper • 2606.13106 • Published 16 days ago • 21
Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation Paper • 2606.06428 • Published 23 days ago • 25
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories Paper • 2605.21468 • Published May 20 • 51
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining Paper • 2605.14747 • Published May 14 • 147
ThoughtTrace: Understanding User Thoughts in Real-World LLM Interactions Paper • 2605.20087 • Published May 19 • 18
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL Paper • 2605.18703 • Published May 18 • 50