Improving Vision-language Models with Perception-centric Process Reward Models Paper • 2604.24583 • Published 11 days ago • 3
drkvcsstvn/smearshare_distribution_activity_lims_fast Viewer • Updated about 5 hours ago • 17 • 359 • 1
CUE-R: Beyond the Final Answer in Retrieval-Augmented Generation Paper • 2604.05467 • Published Apr 7 • 7
An Efficient Heterogeneous Co-Design for Fine-Tuning on a Single GPU Paper • 2603.16428 • Published Mar 17 • 51
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 350
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning Paper • 2603.04597 • Published Mar 4 • 210