MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge Paper • 2604.18164 • Published 8 days ago • 4
MMRefine: Unveiling the Obstacles to Robust Refinement in Multimodal Large Language Models Paper • 2506.04688 • Published Jun 5, 2025 • 3