MiniCPM5-1B NVFP4

NVFP4 weight-only quantization of openbmb/MiniCPM5-1B using llm-compressor / compressed-tensors.
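To make the format concrete, here is a minimal pure-Python sketch of NVFP4-style block quantization. It is not llm-compressor's implementation. It assumes the commonly documented NVFP4 layout: 4-bit E2M1 values that share one scale per block of 16 elements. As a simplification, the per-block scale stays an ordinary Python float, whereas real NVFP4 stores it in FP8 (E4M3) alongside a per-tensor FP32 scale. The names `quantize_block`, `dequantize_block`, and `fake_quantize` are hypothetical helpers for illustration only.

```python
# Illustrative simulation of NVFP4-style block quantization.
# Assumption: scales are kept as Python floats; real NVFP4 stores them as FP8 E4M3.

FP4_E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # representable magnitudes
BLOCK_SIZE = 16  # one shared scale per 16 consecutive weights


def quantize_block(block):
    """Return (scale, codes) for one block; codes are signed values on the FP4 grid."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    scale = amax / 6.0  # map the largest magnitude onto the FP4 maximum (6.0)
    codes = []
    for x in block:
        # Pick the nearest representable magnitude, then restore the sign.
        mag = min(FP4_E2M1_VALUES, key=lambda v: abs(abs(x) / scale - v))
        codes.append(-mag if x < 0 else mag)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate weights from a block scale and its FP4 codes."""
    return [scale * c for c in codes]


def fake_quantize(weights):
    """Quantize then dequantize a flat list of weights, block by block."""
    out = []
    for start in range(0, len(weights), BLOCK_SIZE):
        scale, codes = quantize_block(weights[start:start + BLOCK_SIZE])
        out.extend(dequantize_block(scale, codes))
    return out
```

Because the widest gap on the FP4 grid is 2 (between 4 and 6), the reconstruction error of any weight in this sketch is at most one sixth of its block's largest magnitude.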

Best runtime target: Blackwell-class NVIDIA GPUs with vLLM support for NVFP4.
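A minimal sketch of running the checkpoint through vLLM's offline Python API follows. It assumes a vLLM build with NVFP4 support on suitable hardware, and that the installed version can load this architecture, which is not confirmed here. The `trust_remote_code=True` flag is also an assumption and may be unnecessary. `generate` is a hypothetical helper name.

```python
def generate(prompt, model_id="Reza2kn/MiniCPM5-1B-NVFP4", max_tokens=64):
    """Generate a completion for one prompt with vLLM (greedy decoding)."""
    # Imported lazily so this module can be imported on machines without vLLM.
    from vllm import LLM, SamplingParams

    # Assumption: the checkpoint may ship custom modeling code.
    llm = LLM(model=model_id, trust_remote_code=True)
    params = SamplingParams(temperature=0.0, max_tokens=max_tokens)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text


if __name__ == "__main__":
    print(generate("Briefly explain what weight-only quantization is."))
```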

Safetensors
Model size: 0.8B params
Tensor types: F32, BF16, F8_E4M3, U8

Model tree for Reza2kn/MiniCPM5-1B-NVFP4: quantized from openbmb/MiniCPM5-1B.