Libraries

How to use runjiazeng/Q-Bridge with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-1.7B")
model = PeftModel.from_pretrained(base_model, "runjiazeng/Q-Bridge")

Transformers

How to use runjiazeng/Q-Bridge with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="runjiazeng/Q-Bridge")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("runjiazeng/Q-Bridge")
model = AutoModelForCausalLM.from_pretrained("runjiazeng/Q-Bridge")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use runjiazeng/Q-Bridge with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "runjiazeng/Q-Bridge"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "runjiazeng/Q-Bridge",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/runjiazeng/Q-Bridge

SGLang

How to use runjiazeng/Q-Bridge with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "runjiazeng/Q-Bridge" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "runjiazeng/Q-Bridge",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "runjiazeng/Q-Bridge" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "runjiazeng/Q-Bridge",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use runjiazeng/Q-Bridge with Docker Model Runner:
```
docker model run hf.co/runjiazeng/Q-Bridge
```

Q-Bridge

Model Details

Model type: LoRA fine-tuned causal language model for instruction following
Base model: Qwen/Qwen3-1.7B
Parameter-efficient method: Low-Rank Adaptation (LoRA) applied to transformer projection layers.
Libraries: Transformers, PEFT, Datasets, Weights & Biases logging.

Intended Use

The adapter specializes a base LLM to translate classical machine learning (CML) module descriptions into quantum machine learning (QML) implementations. Use it by loading the base model (Qwen/Qwen3-1.7B) and applying the LoRA weights through PEFT before prompting with CML descriptions.

Training Data

Source dataset: runjiazeng/CML-2-QML (train split only).
Filtering: examples whose reported average length exceeds half of the max_length argument are removed to stay within the tokenizer context window.
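
The filtering rule above can be sketched as a simple predicate. This is a minimal illustration, not the code from the training script: the `average_length` field name and the sample records are hypothetical, and the boundary case (length exactly half of `max_length`) is assumed to be kept because the rule removes only examples that exceed the threshold.

```python
def within_length_budget(avg_length: float, max_length: int) -> bool:
    """Return True when the reported average length is at most half of max_length."""
    return avg_length <= max_length / 2


# Hypothetical records; the real dataset's column names may differ.
examples = [
    {"id": "a", "average_length": 900},
    {"id": "b", "average_length": 1024},
    {"id": "c", "average_length": 1500},
]
max_length = 2048

# Keep only examples that fit the budget of max_length / 2.
kept = [ex for ex in examples if within_length_budget(ex["average_length"], max_length)]
kept_ids = [ex["id"] for ex in kept]
print(kept_ids)
```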

Prompt template:

You are an expert quantum machine learning researcher. Translate the provided classical machine learning (CML) description into its quantum machine learning (QML) counterpart.

CML Description:
<cml_text>

QML Solution:

Targets: Ground-truth QML solutions appended after the prompt.
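
As a rough sketch of how a training example could be assembled from the template and target described above: the helper names `build_prompt` and `build_training_text` are hypothetical, and the exact whitespace between "QML Solution:" and the target, as well as the EOS handling, are assumptions rather than details confirmed by the training script.

```python
PROMPT_TEMPLATE = (
    "You are an expert quantum machine learning researcher. "
    "Translate the provided classical machine learning (CML) description "
    "into its quantum machine learning (QML) counterpart.\n\n"
    "CML Description:\n{cml_text}\n\n"
    "QML Solution:"
)


def build_prompt(cml_text: str) -> str:
    """Fill the prompt template with a CML description."""
    return PROMPT_TEMPLATE.format(cml_text=cml_text.strip())


def build_training_text(cml_text: str, qml_solution: str, eos_token: str = "") -> str:
    """Append the ground-truth QML solution after the prompt.

    The newline separator and the optional EOS suffix are assumptions.
    """
    return build_prompt(cml_text) + "\n" + qml_solution.strip() + eos_token


example = build_training_text("A two-layer MLP classifier.", "# QML code goes here")
print(example)
```

At inference time only `build_prompt` would be needed, since the model is expected to generate the text that follows "QML Solution:".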

Training Procedure

Tokenization: Uses the base model tokenizer with right padding; the EOS token serves as the pad token when no explicit pad token exists. Labels for prompt tokens are masked with -100 so that the loss is computed only on the target QML solution tokens.
Batching: Custom data collator pads inputs dynamically and aligns masked labels.
Hardware setup: The script detects distributed settings via the LOCAL_RANK and WORLD_SIZE environment variables and optionally enables DeepSpeed ZeRO-3.
Optimization:
- Learning rate default 2e-5 with cosine schedule and 0.03 warmup ratio.
- AdamW optimizer with weight_decay=0.1, max_grad_norm=1.0.
- Gradient accumulation steps default to 8, per-device batch size 1.
- Training runs for 3 epochs with gradient checkpointing enabled.
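
To make the optimization defaults above concrete, here is a small sketch of a cosine schedule with linear warmup and of the effective batch size. It approximates the common warmup-then-cosine shape; the exact rounding of warmup steps and the scheduler implementation used by the training script are assumptions, and both function names are hypothetical.

```python
import math


def lr_at_step(step: int, total_steps: int, base_lr: float = 2e-5,
               warmup_ratio: float = 0.03) -> float:
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)  # rounding rule is an assumption
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def effective_batch_size(per_device: int = 1, grad_accum: int = 8,
                         world_size: int = 1) -> int:
    """Number of sequences contributing to each optimizer step."""
    return per_device * grad_accum * world_size


# With the defaults, one GPU sees 8 sequences per optimizer step; 4 GPUs see 32.
print(effective_batch_size(world_size=1), effective_batch_size(world_size=4))
print(lr_at_step(0, 1000), lr_at_step(30, 1000), lr_at_step(1000, 1000))
```
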
LoRA configuration: rank 64, alpha 128, dropout 0.05, bias disabled. Target modules default to gate_proj, down_proj, up_proj if present; otherwise all linear layers except lm_head.
Logging & checkpoints: The Weights & Biases run is configured via CLI arguments; checkpoints are saved every 500 steps, and at most 2 are kept.
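
The label masking and dynamic padding described under Tokenization and Batching can be illustrated with plain Python lists. This is a minimal sketch, not the collator from the training script: the function names `build_labels` and `collate` and the toy token ids are hypothetical, and real code would return tensors rather than lists.

```python
IGNORE_INDEX = -100  # label value ignored by the cross-entropy loss


def build_labels(prompt_ids, answer_ids):
    """Concatenate prompt and answer ids; mask the prompt part of the labels."""
    input_ids = list(prompt_ids) + list(answer_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(answer_ids)
    return input_ids, labels


def collate(batch, pad_token_id):
    """Right-pad every example to the longest sequence in the batch."""
    max_len = max(len(ids) for ids, _ in batch)
    input_ids, attention_mask, labels = [], [], []
    for ids, lab in batch:
        pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * pad)
        attention_mask.append([1] * len(ids) + [0] * pad)
        labels.append(lab + [IGNORE_INDEX] * pad)  # padding is never scored
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}


# Toy token ids: two examples of different lengths, pad id 0.
batch = [build_labels([1, 2, 3], [4, 5]), build_labels([6], [7])]
out = collate(batch, pad_token_id=0)
print(out["input_ids"])
print(out["labels"])
```

Padding positions receive the same -100 label as prompt tokens, so only the answer tokens contribute to the loss.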

Evaluation

No automatic evaluation metrics are computed in the training script. Users should validate generations on held-out CML descriptions.
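
One lightweight check a user might run on held-out generations is whether the output at least parses. The sketch below assumes the generated QML solution is Python source, which the model card does not state; `is_valid_python` is a hypothetical helper, and a successful parse says nothing about whether the code runs or is a correct quantum counterpart.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if the text parses as Python source code."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Hypothetical generations standing in for model outputs.
generations = {
    "ok": "def circuit(x):\n    return x\n",
    "broken": "def circuit(x:\n    return x\n",
}
results = {name: is_valid_python(text) for name, text in generations.items()}
print(results)
```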

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("runjiazeng/Q-Bridge", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("runjiazeng/Q-Bridge", use_fast=False)

prompt = """You are an expert quantum machine learning researcher. Translate the provided classical machine learning (CML) description into its quantum machine learning (QML) counterpart.

CML Description:
<your description>

QML Solution:"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Environmental Impact

The script supports DeepSpeed ZeRO-3 and gradient checkpointing to reduce memory consumption. Exact training footprint depends on the user's hardware and run duration.

Risks and Limitations

The model inherits biases from the base Qwen3-1.7B model.
Generated QML code may be unverified or non-executable. Users must review outputs before deployment.
The dataset focuses on paired CML→QML translations; performance on unrelated tasks is likely poor.

Training Script

The full training procedure, CLI, and data processing logic are provided in q-bridge-lora.py within this repository.

Model size: 2B params
Tensor type: F32 (Safetensors)
Model tree: Qwen/Qwen3-1.7B-Base (base model) → Qwen/Qwen3-1.7B (fine-tuned) → runjiazeng/Q-Bridge (adapter, this model)