Active filters: modelopt
decart-ai/Kimi-K2.7-Code-NVFP4
Text Generation
• Updated • 508
• 1
0xSero/GLM-5.2-504B-FullKD
292B • Updated • 114
• 1
nvidia/Llama-4-Scout-17B-16E-Instruct-NVFP4
56B • Updated • 95.3k
• 32
nvidia/Llama-4-Maverick-17B-128E-Instruct-FP8
402B • Updated • 588
• 15
nvidia/Llama-4-Scout-17B-16E-Instruct-FP8
109B • Updated • 328k
• 16
ishan24/test_modelopt_quant
Updated • 10
nvidia/Llama-4-Maverick-17B-128E-Eagle3
2B • Updated • 10
• 11
NVFP4/Qwen3-30B-A3B-Instruct-2507-FP4
Text Generation
• 16B • Updated • 2.8k
• 12
gesong2077/Qwen3-32B-NVFP4
19B • Updated • 2
• 1
54B • Updated • 6
nvidia/Phi-4-multimodal-instruct-NVFP4
4B • Updated • 906
• 12
nvidia/Phi-4-reasoning-plus-FP8
15B • Updated • 157
• 7
nvidia/Llama-3.1-8B-Instruct-NVFP4
5B • Updated • 241k
• 11
Text Generation
• 5B • Updated • 126k
• 17
Text Generation
• 8B • Updated • 5.04k
• 6
Text Generation
• 8B • Updated • 75.3k
• 12
Text Generation
• 15B • Updated • 11.1k
• 6
Text Generation
• 17B • Updated • 258k
• 17
nvidia/Qwen2.5-VL-7B-Instruct-FP8
Text Generation
• 8B • Updated • 1.46k
• 8
nvidia/Qwen2.5-VL-7B-Instruct-NVFP4
Text Generation
• 5B • Updated • 9.96k
• 16
nuphoto-ian/Qwen3-8B-QAT-NVFP4
5B • Updated • 4
txn545/Qwen3-Coder-30B-A3B-Instruct-NVFP4
16B • Updated • 100
• 1
shanjiaz/gpt-oss-120b-nvfp4-modelopt
59B • Updated • 616
• 4
shanjiaz/gpt-oss-20b-nvfp4-modelopt
11B • Updated • 26
• 1
nvidia/Llama-3.1-Nemotron-Nano-VL-8B-V1-FP4-QAD
Image-Text-to-Text
• 6B • Updated • 790
• 15
baseten-admin/glm-4.6-fp4
177B • Updated • 3
baseten-admin/glm-4.6-fp8
353B • Updated • 1
baseten-admin/glm-4.6-fp4-mlp
183B • Updated • 1
shinedays1993/Qwen3-30B-A3B-nvfp4
16B • Updated • 1
shinedays1993/Qwen3-32B-nvfp4
17B • Updated • 2