OpenSkillEval: Automatically Auditing the Open Skill Ecosystem for LLM Agents Paper • 2605.23657 • Published 8 days ago • 7
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Paper • 2605.30611 • Published 8 days ago • 185
RoboSemanticBench: Diagnosing Semantic Grounding in Action Prediction for VLA Models Paper • 2606.02277 • Published 4 days ago • 7
Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation Paper • 2605.29861 • Published 8 days ago • 16
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 17 days ago • 185
RigidFormer: Learning Rigid Dynamics using Transformers Paper • 2605.09196 • Published 27 days ago • 14
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention Paper • 2605.05838 • Published 29 days ago • 5
Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study Paper • 2605.06643 • Published 29 days ago • 4
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504