蔡正舟

conctsai

AI & ML interests

None yet

Organizations

None yet

upvoted 6 papers 6 days ago

HodgeCover: Higher-Order Topological Coverage Drives Compression of Sparse Mixture-of-Experts

Paper • 2605.13997 • Published 11 days ago • 5

Look Before You Leap: Autonomous Exploration for LLM Agents

Paper • 2605.16143 • Published 9 days ago • 9

Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards

Paper • 2605.14539 • Published 10 days ago • 5

Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation

Paper • 2605.11739 • Published 11 days ago • 55

Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding

Paper • 2605.02290 • Published 20 days ago • 39

Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR

Paper • 2605.15726 • Published 9 days ago • 32

upvoted a paper 26 days ago

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published Apr 2 • 101