Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks • Paper • 2604.20987 • Published 8 days ago • 21
Self-Rewarding Vision-Language Model via Reasoning Decomposition • Paper • 2508.19652 • Published Aug 27, 2025 • 84
CFMatch: Aligning Automated Answer Equivalence Evaluation with Expert Judgments For Open-Domain Question Answering • Paper • 2401.13170 • Published Jan 24, 2024 • 4