lihaoxin2020/qwen3-4B-refiner-3201-rl-balanced-step50 Text Generation • 196k • Updated Apr 12 • 1 • 1
lihaoxin2020/qwen3-4B-refiner-sft-rl-balanced-resume-step100 Text Generation • 196k • Updated Apr 14 • 4
lihaoxin2020/qwen3-4b-refiner-gpt54-instance-rubric-gpt54-grpo-step50 Text Generation • 196k • Updated Apr 20 • 3