Models: 396

Active filters: compression

RedHatAI/sst2-distilbert-sparse-blog

Text Classification • Updated Mar 13, 2024 • 8 • 4

aoxo/kevin-token-compressor

Text Generation • Updated Sep 18, 2024 • 6

justuswill/UQDM

Updated Jun 18, 2025

kalsy/TEZip

Updated Jun 24, 2025

prompterminal/nanogpt-shakespeare-compressed

Text Generation • Updated Jul 21, 2025 • 5

prompterminal/nanogpt-enwik8-compressed

Text Generation • Updated Jul 21, 2025 • 3

prompterminal/nanogpt-enwik8-compressed-working

Text Generation • Updated Jul 21, 2025 • 7 • 1

haichaozhang/VQ-Token-llava-ov-0.5b

Video-Text-to-Text • 1B • Updated Sep 21, 2025 • 2 • 2

qep/qep-1bit-extreme

Text Generation • Updated Sep 8, 2025 • 36 • 15

swayamsingal/tencent-Hunyuan-MT-7B-light-nanoquant-light

8B • Updated Sep 3, 2025 • 3 • 1

swayamsingal/tencent-Hunyuan-MT-7B-medium-nanoquant-medium

8B • Updated Sep 3, 2025 • 3

kyne0127/Qwen3-30B-A3B-TopK4-Compressed

Text Generation • 31B • Updated Sep 22, 2025 • 2

mradermacher/Qwen3-30B-A3B-TopK4-Compressed-GGUF

31B • Updated Sep 28, 2025 • 38

mradermacher/Qwen3-30B-A3B-TopK4-Compressed-i1-GGUF

31B • Updated Dec 8, 2025 • 93

ggunio/B2NL-v6.1.2

Updated Sep 26, 2025 • 4

ggunio/B2NL-IntelligentTokenizer-v6.2.1

Updated Oct 7, 2025 • 6

cerebras/Qwen3-Coder-REAP-363B-A35B-FP8

Text Generation • Updated Oct 14, 2025 • 35 • 16

cerebras/Qwen3-Coder-REAP-246B-A35B-FP8

Text Generation • 246B • Updated Oct 14, 2025 • 148 • 22

huawei-csl/Qwen3-1.7B-3bit-SINQ

Text Generation • 0.5B • Updated Feb 2 • 9 • 7

huawei-csl/Qwen3-1.7B-3bit-ASINQ

Text Generation • 0.5B • Updated Feb 2 • 8 • 7

huawei-csl/Qwen3-14B-3bit-SINQ

Text Generation • 3B • Updated Feb 2 • 16 • 5

huawei-csl/Qwen3-14B-3bit-ASINQ

Text Generation • 3B • Updated Feb 2 • 6 • 5

huawei-csl/Qwen3-32B-3bit-SINQ

Text Generation • 6B • Updated Feb 2 • 7 • 6

huawei-csl/Qwen3-32B-3bit-ASINQ

Text Generation • 6B • Updated Feb 2 • 5 • 5

huawei-csl/Qwen3-1.7B-4bit-SINQ

Text Generation • 1B • Updated Feb 2 • 6 • 5

huawei-csl/Qwen3-1.7B-4bit-ASINQ

Text Generation • 1B • Updated Feb 2 • 6 • 5

huawei-csl/Qwen3-32B-4bit-SINQ

Text Generation • 18B • Updated Feb 2 • 6 • 7

huawei-csl/Qwen3-14B-4bit-SINQ

Text Generation • 9B • Updated Feb 2 • 5 • 5

huawei-csl/Qwen3-14B-4bit-ASINQ

Text Generation • 9B • Updated Feb 2 • 8 • 6

huawei-csl/Qwen3-32B-4bit-ASINQ

Text Generation • 18B • Updated Feb 2 • 6 • 8