Mixture of Experts Collection MoE models built with mergekit and LazyMergekit: https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb#scrollTo=d5mYzDo1q96y • 11 items • Updated Mar 2 • 24
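A minimal sketch of how such a merge is driven, assuming mergekit is installed (`pip install mergekit`); the expert model IDs and routing prompts below are hypothetical placeholders, following mergekit's documented `mergekit-moe` YAML schema:

```python
# Sketch: assemble a MoE from dense experts with mergekit-moe.
# Expert model IDs and prompts are illustrative placeholders.
import subprocess

config = """\
base_model: mistralai/Mistral-7B-v0.1
gate_mode: hidden        # route tokens by hidden-state similarity to the prompts
dtype: bfloat16
experts:
  - source_model: some-org/math-expert-7b      # hypothetical
    positive_prompts: ["solve this equation step by step"]
  - source_model: some-org/code-expert-7b      # hypothetical
    positive_prompts: ["write a python function"]
"""

with open("moe-config.yaml", "w") as f:
    f.write(config)

# mergekit-moe reads the config and writes the merged MoE checkpoint.
subprocess.run(["mergekit-moe", "moe-config.yaml", "./merged-moe"], check=True)
```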
Qwen3 Voice Embedding Collection Standalone ECAPA-TDNN x-vector speaker encoders extracted from Qwen3-TTS. 1024-dim (0.6B) and 2048-dim (1.7B). • 4 items • Updated Feb 27 • 29
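To illustrate what an ECAPA-TDNN x-vector encoder does, here is a sketch using SpeechBrain's public VoxCeleb checkpoint as a stand-in for the Qwen3-TTS-derived encoders (whose loading code may differ; this checkpoint emits 192-dim embeddings rather than 1024/2048):

```python
# Sketch: extract a fixed-size speaker embedding (x-vector) from a wav file.
# Uses SpeechBrain's public ECAPA-TDNN, not the Qwen3-TTS-derived encoders.
import torchaudio
from speechbrain.inference.speaker import EncoderClassifier

encoder = EncoderClassifier.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb",
    savedir="pretrained_ecapa",
)

signal, sr = torchaudio.load("speaker.wav")   # mono 16 kHz audio expected
embedding = encoder.encode_batch(signal)      # shape: [1, 1, 192]
print(embedding.squeeze(0).squeeze(0).shape)
```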
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI • Feb 20 • 504
LFM2 Collection LFM2 is a new generation of hybrid models, designed for on-device deployment. • 28 items • Updated 19 days ago • 153
Multimodal GGUFs Collection Vision and audio models compatible with llama-server and llama-mtmd-cli • 16 items • Updated Dec 18, 2025 • 20
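These GGUFs target the two llama.cpp front ends named above; as a rough Python-side analogue, llama-cpp-python can pair a vision GGUF with its mmproj projector through a chat handler. A sketch with placeholder file paths, using the LLaVA-1.5 handler as the example (the handler must match the model family):

```python
# Sketch: run a vision GGUF plus its mmproj projector from Python.
# File paths are placeholders; pick the chat handler for your model family.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler

handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")
llm = Llama(model_path="vision-model-Q4_K_M.gguf",
            chat_handler=handler, n_ctx=4096)

resp = llm.create_chat_completion(messages=[{
    "role": "user",
    "content": [
        {"type": "image_url", "image_url": {"url": "file:///tmp/photo.jpg"}},
        {"type": "text", "text": "Describe this image."},
    ],
}])
print(resp["choices"][0]["message"]["content"])
```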
Draft Models Collection Tiny "draft" models for speculative decoding. • 14 items • Updated Mar 2 • 6
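A draft model only pays off next to a larger target model that verifies its proposed tokens. In transformers this is exposed as assisted generation via `assistant_model`; a sketch with hypothetical model IDs (draft and target must share a tokenizer):

```python
# Sketch of speculative (assisted) decoding: a tiny draft model proposes
# tokens, and the large target model verifies whole chunks of them at once.
# Model IDs are hypothetical; draft and target must share a tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

target_id = "big-org/target-7b"   # hypothetical
draft_id = "big-org/draft-0.5b"   # hypothetical

tok = AutoTokenizer.from_pretrained(target_id)
target = AutoModelForCausalLM.from_pretrained(target_id, torch_dtype=torch.bfloat16)
draft = AutoModelForCausalLM.from_pretrained(draft_id, torch_dtype=torch.bfloat16)

inputs = tok("The capital of France is", return_tensors="pt")
out = target.generate(**inputs, assistant_model=draft, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```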
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2, 2025 • 158
Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19, 2025 • 188
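Datasets like these load with the standard `datasets` API; a sketch where the repo ID and field names are placeholders:

```python
# Sketch: stream a reasoning-trace dataset from the Hub without a full download.
# The dataset ID is a placeholder; field names vary per dataset.
from datasets import load_dataset

ds = load_dataset("some-org/math-reasoning-traces", split="train", streaming=True)
for example in ds.take(3):
    print(example)  # typically a problem, a long reasoning trace, and an answer
```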
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 145
Turning large language models into cognitive models Paper • 2306.03917 • Published Jun 6, 2023 • 5
Unsloth Dynamic 2.0 Quants Collection Version 2.0 of our Dynamic GGUFs and quants. Dynamic 2.0 achieves superior accuracy and state-of-the-art quantization performance. • 88 items • Updated 5 days ago • 560
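Quants like these can be pulled straight from the Hub with llama-cpp-python; a sketch where the repo ID is a placeholder and the glob selects one quant file:

```python
# Sketch: download and run a GGUF quant directly from the Hub.
# The repo ID is a placeholder; the glob picks the 4-bit K-quant file.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="unsloth/some-model-GGUF",  # hypothetical repo
    filename="*Q4_K_M.gguf",
    n_ctx=4096,
)
print(llm("Q: What is 2 + 2? A:", max_tokens=8)["choices"][0]["text"])
```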
Granite Quantized Models Collection Quantized versions of IBM Granite models. • 44 items • Updated 6 days ago • 33
Text-to-Speech (TTS) models Collection A collection of 4-bit, Dynamic 4-bit, and 16-bit voice models, including Sesame-CSM, OpenAI's Whisper, and Orpheus. Fine-tune them with Unsloth now! • 16 items • Updated 5 days ago • 28
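Of the models named, Whisper is the speech-recognition member with the most settled API; a sketch of running it through the transformers pipeline (using the public checkpoint, not a quantized variant):

```python
# Sketch: transcribe an audio file with Whisper via the transformers pipeline.
# Uses the public openai/whisper-small checkpoint as an example.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
print(asr("speech.wav")["text"])
```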
Qwen3 Collection Qwen's new Qwen3 models, in Unsloth Dynamic 2.0, GGUF, 4-bit, and 16-bit Safetensors formats. Includes 128K context-length variants. • 70 items • Updated 5 days ago • 271
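For the 4-bit Safetensors variants, the usual loading route is transformers plus bitsandbytes; a sketch where the model ID stands in for any Qwen3 causal-LM repo (pre-quantized bnb repos would skip the explicit config):

```python
# Sketch: load a causal LM in 4-bit NF4 with transformers + bitsandbytes.
# The model ID is a stand-in; a CUDA GPU is assumed for bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "Qwen/Qwen3-0.6B"  # stand-in for any Qwen3 repo in this collection
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)

inputs = tok("Hello", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```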
Gemma 3 QAT Collection Quantization-Aware Trained (QAT) Gemma 3 checkpoints. These models preserve quality comparable to half precision while using about 3x less memory. • 15 items • Updated Mar 12 • 218