QVal: Cheaply Evaluating Dense Supervision Signals for Long-Horizon LLM Agents Paper • 2606.32034 • Published 3 days ago • 9
ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models Paper • 2505.13180 • Published May 19, 2025 • 13
Article Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent +2 qgallouedec, edbeeching, ClementRomac, thomwolf • Apr 22, 2024 • 81