PRISM: A Multi-Dimensional Benchmark for Evaluating LLM Peer Reviewers • Paper • 2605.26730 • Published 23 days ago • 16
Off-the-Shelf LLMs as Process Scorers: Training-Free Alternative to PRMs for Mathematical Reasoning • Paper • 2606.01682 • Published 18 days ago • 7
LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs • Paper • 2606.06286 • Published 15 days ago • 8
Skip a Layer or Loop It? Learning Program-of-Layers in LLMs • Paper • 2606.06574 • Published 15 days ago • 22