JiRack Coder Reasoning 32B INT4
A fast and efficient coding assistant with a clean built-in web UI, powered by Qwen3.0-Coder-32B-Instruct and optimized using Microsoft ONNX Runtime.
- JiRack is a cloud-ready model that helps reduce cloud infrastructure costs. It can serve as an expert model in RAG deployments; the ONNX JiRack Java server is available as an alternative.
- Subscription: $1 per month per user (updated license for non-company use).
- Corp Subscription: $3 per month per user (updated license for company use).
- The model works without a subscription, but it displays a message about subscribing.
Quick Start
Watch JiRack Coder Reasoning 32B in action:
DEMO: JiRack Coder Reasoning 32B Web UI
Run with Docker
Default CPU
docker run -d \
--name jirack_coder_reasoning_32b \
-p 7869:7869 \
--restart unless-stopped \
cmsmanhattan/jirack_coder_32b_int4_qwenbase:latest
Multi CPU
docker run -d \
--name jirack_coder_reasoning_32b \
-p 7869:7869 \
--restart unless-stopped \
--memory=48g \
--cpus=16 \
cmsmanhattan/jirack_coder_32b_int4_qwenbase:latest
GPU (Coming soon)
docker run -d \
--name jirack_coder_reasoning_32b \
-p 7869:7869 \
--gpus all \
--restart unless-stopped \
cmsmanhattan/jirack_coder_32b_int4_gpu_qwenbase:latest
Docker Compose Example
services:
  jirack:
    image: cmsmanhattan/jirack_coder_32b_int4_qwenbase:latest
    container_name: jirack_onnx_service
    ports:
      - "7869:7869"
    volumes:
      - .:/app
      - ./web:/app/web
    environment:
      - MAX_TOKENS=1024
      - TEMPERATURE=0.7
      - TOP_P=0.9
      - DEFAULT_STREAM=False
      - INTRA_THREADS=4
      - USE_ENV_ALLOCATOR=1
    deploy:
      resources:
        limits:
          memory: 48g
Access the UI
Once the container is running, open your browser and navigate to:
http://localhost:7869
This opens the JiRack Coder UI, a clean web interface designed for coding.
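A large model may need some time to load after the container starts, so the page might not answer right away. The following is a minimal sketch of a readiness check that polls the address above until it responds. The function name wait_for_ui and the timeout values are illustrative assumptions and are not part of JiRack; only the address http://localhost:7869 comes from this README.

```python
import time
import urllib.error
import urllib.request


def wait_for_ui(url: str = "http://localhost:7869",
                timeout_s: float = 300.0,
                interval_s: float = 5.0) -> bool:
    """Poll `url` until the server answers or `timeout_s` seconds have passed.

    Returns True as soon as any HTTP response arrives, False on timeout.
    The default timeout and interval are arbitrary illustrative values.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5):
                # A successful response means the web server is up.
                return True
        except urllib.error.HTTPError:
            # The server replied with an error status, so it is listening.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused, reset, or timed out: keep waiting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling wait_for_ui() with no arguments checks the default address for up to five minutes. Pass a different url if the port mapping has been changed.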
Changing the Port
The listening port can be changed from the Settings panel in the JiRack Coder UI.
Licensing
- The JiRack Coder Reasoning 32B model is provided under a commercial enterprise license.
- All JiRack UI clients are provided under a commercial license.
- However, the UI clients can be used for free when running together with the official JiRack Docker containers, as long as they are not redistributed separately.
JiRack Coder 14B is available under a lighter commercial license (approximately $12 per user per year).
For commercial licensing, cluster deployment, or enterprise use, please contact us.
JiRack MS Windows 11 Desktop Client (with Ollama API):
https://huggingface.co/kgrabko/JiRackTernary_1b/resolve/main/jirack-chat.zip
Live email chat with the model: support@cmsmanhattan.com
Hardware Recommendations for AMD Systems
Note: This model is significantly heavier than JiRack Coder Reasoning 14B INT4.
Recommended Hardware for JiRack Coder Reasoning 32B INT4 (single Docker container)
| Use Case | CPU | GPU (ROCm) | VRAM / RAM | Expected Speed | Recommendation |
|---|---|---|---|---|---|
| Recommended | Ryzen 9 7950X / 9950X | RX 7900 XTX / 2x RX 7900 XT | 48GB+ VRAM | 35-55 tokens/s | Best choice |
| High Performance | Ryzen 9 9950X / Threadripper | 2x RX 7900 XTX | 48-64GB VRAM | 50-75 tokens/s | Excellent |
| Enterprise | EPYC 7003/9004 series | MI300X or 4x RX 7900 XTX | 96GB+ VRAM | 70-110 tokens/s | Best for production |
| Budget Option | Ryzen 7 7700 / 9700X | RX 7900 XTX (24GB) | 24GB+ VRAM | 25-40 tokens/s | Acceptable |
Important Memory Notes
Even though the 32B INT4 model itself takes approximately 12–14 GB, we recommend at least 48GB VRAM for the following reasons:
- KV-cache consumption during generation, especially with long context
- ONNX Runtime overhead and temporary buffers
- System stability and avoiding out-of-memory errors
- Room for larger context windows
Minimum recommended: 48GB VRAM (dual RX 7900 series or MI300X)
Ideal: 48–64GB VRAM
For pure CPU inference (no GPU), we recommend at least 128GB system RAM (Ryzen 9 7950X/9950X or better).
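To give a feel for how the KV cache grows with context length, here is a rough back-of-the-envelope estimate in Python. The architecture values (64 layers, 8 key/value heads, head dimension 128, 2 bytes per cached value) are assumptions chosen to resemble a 32B-class model with grouped-query attention; they are not read from the JiRack model files, so treat the results as an order-of-magnitude guide only.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2,
                   batch_size: int = 1) -> int:
    """Bytes needed to cache keys and values for every layer and token."""
    # The factor 2 accounts for storing both a key and a value tensor.
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens * batch_size


# Hypothetical architecture values, for illustration only.
ASSUMED = dict(num_layers=64, num_kv_heads=8, head_dim=128)

for context in (4096, 32768):
    gib = kv_cache_bytes(context_tokens=context, **ASSUMED) / 2**30
    print(f"{context:>6} tokens -> {gib:.1f} GiB of KV cache")
```

Under these assumptions the cache needs about 1 GiB at 4,096 tokens and 8 GiB at 32,768 tokens, on top of the model weights and runtime buffers. The estimate scales linearly with context length and with the number of concurrent sequences.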
The default model is also included in full FP32 precision. It serves as the base for quantization, which makes it possible to find the best balance between model size and performance.
📧 Contact & Licensing
For joint venture opportunities, hardware integration, or licensing inquiries:
- Email: grabko@cmsmanhattan.com
- Phone: +1 (516) 777-0945
- Location: New York, USA