MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge Paper • 2604.18164 • Published 10 days ago • 4
MMRefine: Unveiling the Obstacles to Robust Refinement in Multimodal Large Language Models Paper • 2506.04688 • Published Jun 5, 2025 • 3
CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays Paper • 2602.23276 • Published Feb 26 • 16