Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs Paper • 2605.09063 • Published 28 days ago • 80
deepseek-ai/DeepSeek-V4-Pro Text Generation • 862B • Updated about 1 month ago • 5.56M • 4.65k
HriDal/agent-2048-game-qwen-7b-2k-ds Reinforcement Learning • 8B • Updated Apr 1, 2025 • 2 • 1
deepseek-ai/DeepSeek-V3.2-Speciale Text Generation • 685B • Updated Dec 1, 2025 • 12.7k • 708
Alibaba-NLP/Tongyi-DeepResearch-30B-A3B Text Generation • 31B • Updated Oct 10, 2025 • 32.8k • 814