Gemma 4 Assistant GGUF • Collection • Gemma 4 MTP assistant drafters as GGUF (F16/Q8_0/Q5_K_M/Q4_K_M/Q4_K_S). Speculative-decoding heads for the atomic-llama-cpp-turboquant fork. • 4 items • Updated 22 days ago • 11
Article Google releases Gemma 2 2B, ShieldGemma and Gemma Scope +2 Xenova, pcuenq, reach-vb, joaogante • Jul 31, 2024 • 60
Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA +3 ybelkada, timdettmers, artidoro, sgugger, smangrul • May 24, 2023 • 180
Article Parameter-Efficient Fine-Tuning using 🤗 PEFT smangrul, sayakpaul • Feb 10, 2023 • 119
Article Fine-Tuning Gemma Models in Hugging Face +2 svaibhav, alanwaketan, ybelkada, ArthurZ • Feb 23, 2024 • 46
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters Paper • 2408.03314 • Published Aug 6, 2024 • 67