minimax-m2.7

Serve the model with vLLM:

```bash
vllm serve callgg/minimax-m2.7 \
    --host 0.0.0.0 --port 8000 \
    --served-model-name minimax-m2.7 \
    --trust-remote-code \
    --max-model-len 75776 \
    --gpu-memory-utilization 0.85 \
    --kv-cache-dtype fp8 \
    --load-format fastsafetensors \
    --enable-auto-tool-choice \
    --tool-call-parser minimax_m2 \
    --reasoning-parser minimax_m2_append_think
```
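Once running, the server exposes an OpenAI-compatible API (by default at `http://localhost:8000/v1`). Since `--enable-auto-tool-choice` is set, the server can emit tool calls when the request includes tool definitions. The sketch below builds such a request payload; the `get_weather` tool and the endpoint URL are hypothetical placeholders for illustration, not part of the model.

```python
import json

# Example chat-completion request for the vLLM OpenAI-compatible endpoint
# started above (assumed at http://localhost:8000/v1/chat/completions).
# The get_weather tool is a hypothetical example definition.
payload = {
    "model": "minimax-m2.7",  # must match --served-model-name
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
    # With --enable-auto-tool-choice, the server decides when to call a tool.
    "tool_choice": "auto",
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# Serialize for an HTTP POST (e.g. with curl or requests).
body = json.dumps(payload)
```

Tool calls returned by the model are parsed server-side by `--tool-call-parser minimax_m2`, so they arrive in the standard `tool_calls` field of the response rather than as raw text.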
Downloads last month: 47
Model size: 24B params (Safetensors)
Tensor types: I32 · F16 · F32 · BF16