LLM-Zero-Lite Experiments
A controlled comparison of continuous GRPO, fixed staged GRPO, and an
LLM-controlled staged GRPO schedule on three-number Countdown using
Qwen/Qwen3-1.7B with LoRA.
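
For readers unfamiliar with the task, a three-number Countdown answer can be scored with a small verifier like the one below. This is a minimal sketch, not the reward function used in these runs: the name `countdown_reward`, the binary 0/1 reward, and the rule that each given number must be used exactly once are all assumptions made for illustration.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Arithmetic operators allowed in an answer expression (assumed set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate a parsed expression exactly, recording each integer literal in `used`."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    # Anything else (names, calls, unary minus, ...) is rejected.
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Hypothetical binary reward: 1.0 if the expression uses exactly the
    given numbers (each once) and evaluates to the target, else 0.0."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        used = []
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    if Counter(used) != Counter(numbers):
        return 0.0
    return 1.0 if value == target else 0.0
```

Exact `Fraction` arithmetic avoids floating-point mismatches when an answer involves division, for example `countdown_reward("8 / 4 + 1", [1, 4, 8], 3)`.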
Final 1,000-step results
| Method | Greedy accuracy | Sampled pass@1 | Sampled pass@4 |
|---|---|---|---|
| Continuous GRPO | 26.5% | 31.0% | 35.5% |
| Fixed staged GRPO | 34.5% | 34.5% | 39.5% |
| LLM controller | 36.5% | 37.5% | 40.5% |
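
The card does not state how the sampled pass@1 and pass@4 columns were computed. One common choice is the unbiased pass@k estimator, sketched below under that assumption. The helper names `pass_at_k` and `mean_pass_at_k` are hypothetical, and the per-problem sample count is not given in the source.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k for one problem: n samples drawn, c of them correct.

    Returns the probability that at least one of k samples, chosen
    without replacement from the n drawn, is correct.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) counts the k-subsets that contain no correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over problems; `results` is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With exactly four samples per problem, this reduces to the fraction of correct samples for pass@1 and to "at least one sample correct" for pass@4.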
The `runs/` directory contains metrics, evaluation samples, configuration
history, controller decisions, logs, plots, and all saved LoRA checkpoints.