Article We Got Claude to Fine-Tune an Open Source LLM burtenshaw, evalstate • Dec 4, 2025 • 630
Article Performant local mixture-of-experts CPU inference with GPU acceleration in llama.cpp Doctor-Shotgun • Jan 30 • 28
Article Small Language Models (SLM): A Comprehensive Overview jjokah • Feb 22, 2025 • 163
Article Scaling Real-Time Voice Agents with Cache-Aware Streaming ASR nvidia • Jan 5 • 88
Article The Great Classification Showdown: OSS vs BERT on Consumer Hardware BenTouss • Jan 26 • 12
Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs Paper • 2601.17058 • Published Jan 22 • 190
Scaling Embeddings Outperforms Scaling Experts in Language Models Paper • 2601.21204 • Published Jan 29 • 105
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security Paper • 2601.18491 • Published Jan 26 • 126
Article We Got Claude to Build CUDA Kernels and teach open models! burtenshaw, evalstate, merve, pcuenq • Jan 28 • 158
Load 4bit models 4x faster Collection Native bitsandbytes 4bit pre quantized models • 25 items • Updated 9 days ago • 62
Embedding Models Collection Run or fine-tune embedding models with Unsloth. • 14 items • Updated 9 days ago • 6