OddGridBench: Exposing the Lack of Fine-Grained Visual Discrepancy Sensitivity in Multimodal Large Language Models Paper • 2603.09326 • Published Mar 10
GRPO-VPS: Enhancing Group Relative Policy Optimization with Verifiable Process Supervision for Effective Reasoning Paper • 2604.20659 • Published Apr 22
LlamaSeg: Image Segmentation via Autoregressive Mask Generation Paper • 2505.19422 • Published May 26, 2025
VisNumBench: Evaluating Number Sense of Multimodal Large Language Models Paper • 2503.14939 • Published Mar 19, 2025