PSFT+RL models
SII-Wenhong
wh-zhu
Recent Activity
new activity about 9 hours ago
wh-zhu/Qwen2.5-7B-PSFT-RL-DAPO-90: Add model card and metadata
new activity about 9 hours ago
wh-zhu/qwen2.5-1.5B-longcot-reasoning-HPD: Add model card for HPD distilled Qwen2.5-1.5B
updated a model 3 days ago
wh-zhu/qwen2.5-1.5B-longcot-reasoning-HPD