NVFP4 weight-only quantization of openbmb/MiniCPM5-1B using llm-compressor / compressed-tensors.
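For readers unfamiliar with the format, the following is a minimal sketch of the numerics behind FP4 block quantization, not the code llm-compressor runs. It assumes the usual NVFP4 layout: 4-bit E2M1 values that share one scale per block of 16 weights. It simplifies two things: the block scale is kept as a plain Python float instead of FP8 (E4M3), and the per-tensor global scale is omitted. The function names are hypothetical and chosen for illustration only.

```python
# Illustrative sketch of FP4 (E2M1) block quantization; not llm-compressor's code.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # representable magnitudes
BLOCK_SIZE = 16  # assumed NVFP4 block size: one scale per 16 consecutive values


def quantize_block(block):
    """Return (scale, codes); each code is a signed value from E2M1_GRID."""
    amax = max(abs(w) for w in block)
    # Map the largest magnitude onto 6.0, the top of the E2M1 range.
    # Simplification: real NVFP4 stores this scale in FP8 (E4M3).
    scale = amax / E2M1_GRID[-1] if amax > 0 else 1.0
    codes = []
    for w in block:
        target = abs(w) / scale
        # Nearest grid value; in this sketch ties go to the smaller magnitude.
        nearest = min(E2M1_GRID, key=lambda g: abs(g - target))
        codes.append(-nearest if w < 0 else nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate weights from a block scale and its codes."""
    return [scale * c for c in codes]


def fake_quantize(weights):
    """Quantize then dequantize a flat list of floats, one block at a time."""
    out = []
    for start in range(0, len(weights), BLOCK_SIZE):
        scale, codes = quantize_block(weights[start:start + BLOCK_SIZE])
        out.extend(dequantize_block(scale, codes))
    return out
```

Calling `fake_quantize` on a row of weights shows the rounding error that weight-only FP4 introduces. Activations are untouched, which is what "weight-only" means here.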
Best runtime target: Blackwell-class NVIDIA GPUs with vLLM support for NVFP4.