Qwen3.5-9B LoRA SFT distillation: R7 (86.8% eval) + R8 calibration. Datasets, FP16 checkpoints, and pipeline docs.
lee
cudabenchmarktest
AI & ML interests
Fine-tuning small language models, maintaining chain-of-thought quality, refusal and abliteration, and novel reasoning distillation techniques. When you are wrestling for possession of a sword, the man with the handle always wins.
Recent Activity
updated a dataset 2 days ago
cudabenchmarktest/r8b-tool-sft
updated a dataset 2 days ago
cudabenchmarktest/r8-eval-suite-5bucket
updated a dataset 2 days ago
cudabenchmarktest/r8-thinking-fix-sft
Organizations
None yet