Hao Zhuoyuan 郝卓远

larry2210

https://github.com/hhh2210

hzy2210

AI & ML interests

None yet

Organizations

None yet

Recent Activity

authored a paper 1 day ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2606.04923 • Published 3 days ago • 37

upvoted a paper 1 day ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2606.04923 • Published 3 days ago • 37

submitted a paper to Daily Papers 1 day ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2606.04923 • Published 3 days ago • 37

New activity in MiniMaxAI/role-play-bench 15 days ago

What is the prompt used when using LLM-as-a-judge?

#2 opened 4 months ago by

submitted a paper to Daily Papers 4 months ago

Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning

Paper • 2602.06600 • Published Feb 6 • 3

authored a paper 4 months ago

Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning

Paper • 2602.06600 • Published Feb 6 • 3

upvoted a paper 4 months ago

Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning

Paper • 2602.06600 • Published Feb 6 • 3