TreeSeeker: Tree-Structured Trial, Error, and Return in Deep Search Paper • 2606.11662 • Published 3 days ago • 9
POISE: Position-Aware Undetectable Skill Injection on LLM Agents Paper • 2606.07943 • Published 7 days ago • 4
Unlocking On-Policy Distillation for Any Model Family Space • Running • 113 • Explore on-policy distillation visualization for any model
Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models Article • ServiceNow-AI • Nov 19, 2025 • 34
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents Paper • 2510.14967 • Published Oct 16, 2025 • 34