Qwen3-4B Structured Data Expert (Exp13 - DPO with System Prompt)

This model is a fine-tuned version of Qwen/Qwen3-4B-Instruct-2507, trained with Direct Preference Optimization (DPO).

This repository contains a LoRA adapter trained for structured data generation tasks (JSON, YAML, TOML, XML, CSV, etc.).

Key Feature

The system prompt is embedded in the DPO training data, so the prompt format seen during training matches the one used at inference. In this experiment, code block usage at inference dropped to 0%, compared with roughly 70% for the base DPO run.
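
As an illustration of the idea above, here is a minimal sketch of a preference record whose prompt starts with the system message. The function name `build_dpo_record`, the abbreviated system prompt, and the prompt/chosen/rejected message layout are assumptions made for this example; the actual preprocessing used for this experiment is not shown in this card.

```python
from typing import Dict, List

Message = Dict[str, str]

# Built programmatically so the example contains no literal markdown fence.
FENCE = "`" * 3


def build_dpo_record(
    system_prompt: str, user_request: str, chosen: str, rejected: str
) -> Dict[str, List[Message]]:
    """Build one preference record whose prompt begins with the system message."""
    prompt = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_request},
    ]
    return {
        "prompt": prompt,
        "chosen": [{"role": "assistant", "content": chosen}],
        "rejected": [{"role": "assistant", "content": rejected}],
    }


record = build_dpo_record(
    # Abbreviated stand-in for the full system prompt (assumption for brevity).
    system_prompt="You are a structured data expert. Output only the raw structured data.",
    user_request="Convert to JSON: name=Alice, age=30, city=Tokyo",
    # Chosen: raw structured data only.
    chosen='{"name": "Alice", "age": 30, "city": "Tokyo"}',
    # Rejected: preamble plus a fenced code block.
    rejected="Here is the JSON:\n"
    + FENCE
    + 'json\n{"name": "Alice", "age": 30, "city": "Tokyo"}\n'
    + FENCE,
)
```

Because the system message is part of every training prompt, the chat template applied at inference can produce the same prefix the model saw during preference training.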

Training Configuration

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3-4B-Instruct-2507 + SFT (Exp5) |
| Method | DPO (Direct Preference Optimization) |
| Dataset | u-10bei/dpo-dataset-qwen-cot |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Learning rate | 5e-7 |
| Epochs | 2 |
| Batch size | 4 (gradient accumulation steps: 2) |
| Beta | 0.1 |
| Max length | 1024 |
| Max prompt length | 512 |
| Optimizer | AdamW |
| Warmup ratio | 0.1 |
| Seed | 3407 |
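
A few quantities follow directly from these settings. The sketch below derives the LoRA scaling factor and the effective batch size, and shows how beta enters the DPO loss for a single preference pair (the loss form is the one from the DPO paper cited under Citations). The helper names, the single-device assumption, and the use of standard LoRA scaling (alpha / r) are assumptions for illustration, not details taken from the training script.

```python
import math

LORA_R = 16
LORA_ALPHA = 32
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
BETA = 0.1


def lora_scale(alpha: int, r: int) -> float:
    """Standard LoRA multiplies the low-rank update by alpha / r."""
    return alpha / r


def effective_batch_size(batch_size: int, grad_accum: int, num_devices: int = 1) -> int:
    """Examples contributing to one optimizer step (num_devices=1 is an assumption)."""
    return batch_size * grad_accum * num_devices


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = BETA,
) -> float:
    """DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    # -log(sigmoid(x)) is equal to log(1 + exp(-x)).
    return math.log1p(math.exp(-beta * margin))


print(lora_scale(LORA_ALPHA, LORA_R))                    # 2.0
print(effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS))  # 8
```

When the policy and reference models assign identical log-probabilities, the margin is zero and the loss equals log 2; it falls below that as the policy favors the chosen response more strongly than the reference does.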

System Prompt (used at inference)

You are a structured data expert. Output the requested format directly without any explanation, preamble, or markdown code blocks. Do not write ```json, ```yaml, ```toml, ```xml, ```csv or similar. Output only the raw structured data.

Key Improvements over baseline

System prompt embedded in DPO training: the training and inference prompt formats are consistent.
Clean chosen responses: only the structured data portion is extracted (no code blocks, no preamble).
Code block suppression: 0% code block usage at inference (vs. ~70% for the base DPO run).
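
The cleaning and measurement steps listed above could look roughly like the following sketch. The helper names `extract_structured_part` and `code_block_rate` are hypothetical, and the extraction rule (take the body of the first fenced block, otherwise the stripped text) is an assumption; the card does not show the actual script or how the 0% and ~70% figures were computed.

```python
import re
from typing import Iterable

# Built programmatically so the example contains no literal markdown fence.
FENCE = "`" * 3

# Matches an opening fence with an optional language tag, then the block body.
_FENCED_BLOCK = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def extract_structured_part(response: str) -> str:
    """Return the body of the first fenced block, or the stripped text if none exists."""
    match = _FENCED_BLOCK.search(response)
    if match:
        return match.group(1).strip()
    return response.strip()


def code_block_rate(outputs: Iterable[str]) -> float:
    """Fraction of outputs that contain at least one markdown fence."""
    items = list(outputs)
    if not items:
        return 0.0
    return sum(FENCE in item for item in items) / len(items)


raw = "Here is the JSON:\n" + FENCE + 'json\n{"a": 1}\n' + FENCE + "\nHope this helps."
clean = extract_structured_part(raw)
print(clean)
print(code_block_rate([clean, raw]))
```

This sketch only handles preambles that accompany a fenced block; a response with a preamble but no fence would need an additional rule.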

Inference Example

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_PATH = "tenyyprn/qwen3-4b-structeval-exp13"

SYSTEM_PROMPT = (
    "You are a structured data expert. "
    "Output the requested format directly without any explanation, "
    "preamble, or markdown code blocks. "
    "Do not write ```json, ```yaml, ```toml, ```xml, ```csv or similar. "
    "Output only the raw structured data."
)

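# Load the tokenizer and the base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER_PATH)
# Merge the adapter weights into the base model so inference runs without the PEFT wrapper.
model = model.merge_and_unload()
model.eval()

# Use the same system prompt as in training so the prompt format matches.
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Convert to JSON: name=Alice, age=30, city=Tokyo"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move the inputs to the model's device instead of hard-coding "cuda".
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Greedy decoding gives deterministic output.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))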
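
Since the model is expected to emit raw structured data, the decoded text can be checked with a parser before it is used downstream. The sketch below is one way to do that with the Python standard library only; the function name `is_valid` and the set of covered formats (JSON, XML, CSV) are choices made for this example, and YAML or TOML would need additional parsers.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def is_valid(output: str, fmt: str) -> bool:
    """Return True if `output` parses as `fmt` ("json", "xml", or "csv")."""
    try:
        if fmt == "json":
            json.loads(output)
        elif fmt == "xml":
            ET.fromstring(output)
        elif fmt == "csv":
            rows = list(csv.reader(io.StringIO(output)))
            # Require at least one row and the same column count in every row.
            return bool(rows) and len({len(row) for row in rows}) == 1
        else:
            raise ValueError(f"unsupported format: {fmt}")
    except (json.JSONDecodeError, ET.ParseError, csv.Error):
        return False
    return True


FENCE = "`" * 3  # built programmatically to keep literal fences out of this example
good = '{"name": "Alice", "age": 30, "city": "Tokyo"}'
fenced = FENCE + "json\n" + good + "\n" + FENCE

print(is_valid(good, "json"))
print(is_valid(fenced, "json"))
```

A fenced response fails the JSON check, which makes this kind of validation a simple way to spot code block leakage in generated outputs.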

Citations

@inproceedings{rafailov2023direct,
    title        = {{Direct Preference Optimization: Your Language Model is Secretly a Reward Model}},
    author       = {Rafael Rafailov and Archit Sharma and Eric Mitchell and Christopher D. Manning and Stefano Ermon and Chelsea Finn},
    year         = 2023,
    booktitle    = {Advances in Neural Information Processing Systems 36},
    url          = {http://papers.nips.cc/paper_files/paper/2023/hash/a85b405ed65c6477a4fe8302b5e06ce7-Abstract-Conference.html},
}
