A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks
Paper • 2605.28556 • Published • 54
Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling