Same Claim, Different Judgment: Benchmarking Scenario-Induced Bias in Multilingual Financial Misinformation Detection Paper • 2601.05403 • Published Jan 8 • 11
LLM Safety From Within: Detecting Harmful Content with Internal Representations Paper • 2604.18519 • Published Apr 20 • 26
Beyond Message Passing: A Semantic View of Agent Communication Protocols Paper • 2604.02369 • Published Apr 13
MINER: Mining Multimodal Internal Representation for Efficient Retrieval Paper • 2605.06460 • Published 24 days ago • 3
CARE: Privacy-Compliant Agentic Reasoning with Evidence Discordance Paper • 2604.01113 • Published Apr 1 • 3
QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents Paper • 2605.27068 • Published 5 days ago • 18
FACTS: Table Summarization via Offline Template Generation with Agentic Workflows Paper • 2510.13920 • Published Oct 15, 2025 • 1
Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models Paper • 2603.01571 • Published Mar 2 • 33
RubricBench: Aligning Model-Generated Rubrics with Human Standards Paper • 2603.01562 • Published Mar 2 • 63