Software Last Updated: 2026-05-01
llamafile v0.10.*
This repository contains several llamafiles built from our v0.10.1 release.
These llamafiles come pre-packaged with support for CPU inference, Metal GPUs on macOS, and CUDA GPUs on Linux. If you have a different GPU or operating system, you can download the corresponding library from the Files and versions section and save it in your home directory, where llamafile will find it automatically.
For more information about the project, check out our GitHub repo. To learn how to use llamafiles, see our documentation!
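As a quick orientation, running one of these llamafiles on Linux or macOS generally looks like the following sketch (the filename here is a hypothetical stand-in for whichever llamafile you download from this repository):

```shell
# Hypothetical filename: substitute the llamafile you actually downloaded.
chmod +x mistral-7b-instruct.Q4_0.llamafile   # mark the download as executable
./mistral-7b-instruct.Q4_0.llamafile          # run it; a local chat interface starts
```

Because a llamafile is a single self-contained executable, no separate install step is needed beyond making it executable.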
NOTE: While the llamafile project itself is Apache 2.0-licensed, the licenses of the models bundled with it may differ. Use the table above for reference.