Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 3 days ago • 37
Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning Paper • 2602.06600 • Published Feb 6 • 3