Yulei Qin

yolay

https://yuleichin.github.io/

yulei_qin
yuleiqin

AI & ML interests

Medical Imaging, Computer Vision, Language Models

yolay's collections (4)

The checkpoints of the models trained with Youtu-Agent RL for Code/Math and Search tasks.

yolay/Youtu-Agent-RL-Search-Qwen2.5-7B

Text Generation • 8B • Updated Jan 16 • 2 • 1
yolay/Youtu-Agent-RL-Maths-Qwen2.5-7B

Text Generation • 8B • Updated Jan 16 • 3 • 3
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Paper • 2512.24615 • Published Dec 31, 2025 • 119

Checkpoints of "Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning" [arXiv:2509.22601]

yolay/SPEAR-Sokoban-DrBoT-GiGPO-3B

4B • Updated Oct 15, 2025 • 3
yolay/SPEAR-WebShop-DrBoT-GiGPO-7B

8B • Updated Oct 15, 2025 • 5
yolay/SPEAR-WebShop-DrBoT-GRPO-7B

8B • Updated Oct 15, 2025 • 2
yolay/SPEAR-WebShop-DrBoT-GiGPO-1.5B

2B • Updated Oct 15, 2025 • 3

Data and Checkpoints of "SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents" [arxiv.org/abs/2512.22322]

yolay/SmartSnap-FT

Updated Jan 4 • 93
yolay/SmartSnap-RL

Preview • Updated Jan 4 • 248
yolay/SmartSnap-Qwen2.5-7B

8B • Updated Jan 4 • 4
yolay/SmartSnap-Qwen3-8B

8B • Updated Jan 4 • 2 • 1

Datasets and models in the paper "Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models" [github.com/yuleiqin/RAIF].

yolay/RAIF-Qwen2.5-1.5B

Text Generation • 2B • Updated Jul 31, 2025 • 1
yolay/RAIF-Qwen2.5-7B

Text Generation • 8B • Updated Jul 31, 2025 • 4
yolay/RAIF-Ministral-8B

Text Generation • 8B • Updated Jul 31, 2025 • 4 • 1
yolay/RAIF-DeepScaleR-1.5B

Text Generation • 2B • Updated Jul 31, 2025 • 9
