Chanuk Lee
tally0818
AI & ML interests
LLM post-training
Recent Activity
upvoted a paper 1 day ago
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information
upvoted a paper 2 days ago
The Unlearnability Phenomenon in RLVR for Language Models
upvoted a paper 2 days ago
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories
Organizations
None yet