🚀 nodejs-coder-qwen25
A fine-tuned Qwen2.5-Coder-7B-Instruct model specialized for Node.js backend development. It was trained with LoRA adapters using Unsloth, then merged and exported as a single GGUF file for efficient local inference with Ollama.
🧠 Model Description
| Property | Details |
|---|---|
| Base Model | Qwen2.5-Coder-7B-Instruct |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) via Unsloth |
| Training Framework | TRL SFTTrainer |
| Quantization | GGUF Q4_K_M (~4.4 GB) |
| Context Length | 2048 tokens |
| Language | JavaScript / Node.js |
This model is specifically trained to write clean, production-ready Node.js backend code. It understands common backend patterns including REST APIs, database integrations, authentication, and testing.
🎯 Specialties
- ✅ Express.js — REST APIs, middleware, routing
- ✅ NestJS — modules, controllers, services, guards
- ✅ Sequelize / Prisma — ORM models, migrations, queries
- ✅ MongoDB / Mongoose — schemas, models, aggregations
- ✅ PostgreSQL / pg — raw queries, connection pooling
- ✅ JWT Authentication — login, token generation, guards
- ✅ Jest — unit tests, mocking, integration tests
- ✅ Async/Await — file I/O, error handling, promises
⚡ Quick Start with Ollama
Step 1 — Download files
hf download Sriram-214/nodejs-coder-qwen25 nodejs-coder-Q4_K_M.gguf --local-dir ./
hf download Sriram-214/nodejs-coder-qwen25 Modelfile --local-dir ./
Step 2 — Create Ollama model
ollama create nodejs-coder -f Modelfile
Step 3 — Run
ollama run nodejs-coder
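Once the model has been created, it can also be called from Node.js through Ollama's local HTTP API. The sketch below is a minimal example, assuming Ollama is listening on its default address `http://localhost:11434`, Node.js 18+ (for the global `fetch`), and the model name `nodejs-coder` from Step 2. The helper names `buildChatRequest` and `askModel` are illustrative and not part of any library.

```javascript
// Default local Ollama chat endpoint (assumption: Ollama runs on this machine with default settings).
const OLLAMA_URL = 'http://localhost:11434/api/chat';

// Build the JSON body for a single-turn, non-streaming chat request.
function buildChatRequest(prompt, model = 'nodejs-coder') {
  return {
    model,
    messages: [{ role: 'user', content: prompt }],
    stream: false, // ask for one JSON object instead of a stream of chunks
  };
}

// Send the prompt to the local server and return the assistant's reply text.
async function askModel(prompt) {
  const response = await fetch(OLLAMA_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatRequest(prompt)),
  });
  if (!response.ok) {
    throw new Error(`Ollama request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.message.content;
}

// Example usage (requires the Ollama server to be running):
// askModel('Write an Express.js GET /health endpoint').then(console.log);
```

Keeping the request body in its own function makes it easy to inspect or extend, for example with a per-request `options` object, without touching the network code.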
📋 Modelfile
FROM ./nodejs-coder-Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1
PARAMETER num_ctx 2048
PARAMETER num_gpu 0
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|im_start|>"
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
SYSTEM """You are a Node.js backend development expert. You write clean, efficient, production-ready code following best practices. You specialize in Express.js, NestJS, Sequelize, Prisma, MongoDB, PostgreSQL, and Jest testing. Always include error handling and follow RESTful API design principles."""
Note: Set `num_gpu 0` for CPU-only inference if your GPU has ≤6 GB of VRAM. `num_gpu` is the number of model layers offloaded to the GPU, so increase it on GPUs with more VRAM.
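To make the prompt format concrete, the following sketch reproduces in plain JavaScript what the `TEMPLATE` above expands to for one system message and one user prompt. Ollama performs this rendering itself; `renderChatML` is a hypothetical helper shown only for illustration.

```javascript
// Render a single-turn prompt in the ChatML layout used by the Modelfile TEMPLATE.
function renderChatML(system, prompt) {
  return (
    `<|im_start|>system\n${system}<|im_end|>\n` +
    `<|im_start|>user\n${prompt}<|im_end|>\n` +
    `<|im_start|>assistant\n` // left open so the model writes the assistant turn
  );
}

// Example usage:
// console.log(renderChatML('You are a Node.js expert.', 'Write a GET /health endpoint'));
```

The assistant turn is left open, and the model is expected to close it with `<|im_end|>`, which is why that token appears as a `stop` parameter.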
💬 Example Prompts & Outputs
Express REST API
Prompt: Write a simple Express.js REST API with a GET /health endpoint
const express = require('express');
const app = express();
app.get('/health', (req, res) => {
  res.status(200).json({ status: 'ok' });
});
app.listen(3000, () => console.log('Server running on port 3000'));
Async File Reading
Prompt: Write a Node.js function that reads a JSON file using async/await
const fs = require('fs').promises;
async function readJsonFile(filePath) {
  try {
    const data = await fs.readFile(filePath, 'utf-8');
    return JSON.parse(data);
  } catch (error) {
    throw new Error(`Failed to read file: ${error.message}`);
  }
}
List Directory Files
Prompt: Write a simple Node.js code to list all the files in a directory
const fs = require('fs');
fs.readdir('./', (err, files) => {
  if (err) throw err;
  console.log(files);
});
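The callback version above prints every directory entry, including subdirectories. As a sketch of a promise-based alternative, the function below uses `fs.promises.readdir` with `withFileTypes` to return only regular files. `listFiles` is an illustrative name and not part of the model's output.

```javascript
const fs = require('fs').promises;

// Return the names of regular files (not subdirectories) directly inside dirPath.
async function listFiles(dirPath) {
  const entries = await fs.readdir(dirPath, { withFileTypes: true });
  return entries.filter((entry) => entry.isFile()).map((entry) => entry.name);
}

// Example usage:
// listFiles('./').then(console.log).catch(console.error);
```

Returning the list instead of logging it lets the caller decide how to handle errors and output.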
🏋️ Training Details
| Parameter | Value |
|---|---|
| Base Model | unsloth/Qwen2.5-Coder-7B-Instruct-bnb-4bit |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Training Data | Node.js backend code dataset |
| Training Framework | Unsloth + TRL SFTTrainer |
| Training Environment | Google Colab (T4 GPU) |
| Quantization | Q4_K_M via llama.cpp |
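For readers unfamiliar with the rank and alpha settings, the sketch below shows what they imply under the standard LoRA formulation, where the low-rank update is scaled by alpha / rank. This assumes plain LoRA rather than a rank-stabilized variant, and the 4096 x 4096 matrix in the usage comment is a hypothetical size, not a measured dimension of this model.

```javascript
// Scaling factor applied to the low-rank update in standard LoRA: alpha / rank.
function loraScaling(rank, alpha) {
  return alpha / rank;
}

// Trainable parameters LoRA adds to one dIn x dOut weight matrix:
// matrix A is rank x dIn and matrix B is dOut x rank.
function loraParamsPerMatrix(dIn, dOut, rank) {
  return rank * (dIn + dOut);
}

// Example usage with the values from the table (rank 16, alpha 32)
// and a hypothetical 4096 x 4096 projection:
// loraScaling(16, 32);                 // 2
// loraParamsPerMatrix(4096, 4096, 16); // 131072, versus 16777216 in the full matrix
```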
⚠️ Limitations
- Optimized for Node.js/JavaScript — not suited for other languages
- Context window of 2048 tokens — long files may be truncated
- CPU inference is slow (~3-5 tokens/sec on modern CPUs)
- May occasionally produce outdated library syntax
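Because the prompt and the response share the 2048-token window, it can help to check a prompt's size before sending it. The sketch below uses a rough heuristic of about 4 characters per token, which is an assumption and not the model's real tokenizer. `estimateTokens` and `fitsInContext` are hypothetical helpers.

```javascript
const CONTEXT_TOKENS = 2048; // matches num_ctx in the Modelfile
const CHARS_PER_TOKEN = 4;   // rough assumption; real tokenization varies, especially for code

// Estimate the token count of a string from its length.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// True if the prompt likely leaves at least reservedForReply tokens for the answer.
function fitsInContext(prompt, reservedForReply = 512) {
  return estimateTokens(prompt) + reservedForReply <= CONTEXT_TOKENS;
}

// Example usage:
// if (!fitsInContext(sourceFile)) console.warn('Prompt may be truncated; consider splitting the file.');
```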
📄 License
Apache 2.0 — see LICENSE