Instructions to use notingemiu/llama-3.2-3b-text2sql-lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use notingemiu/llama-3.2-3b-text2sql-lora with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit") model = PeftModel.from_pretrained(base_model, "notingemiu/llama-3.2-3b-text2sql-lora") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- Unsloth Studio
How to use notingemiu/llama-3.2-3b-text2sql-lora with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for notingemiu/llama-3.2-3b-text2sql-lora to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for notingemiu/llama-3.2-3b-text2sql-lora to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for notingemiu/llama-3.2-3b-text2sql-lora to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="notingemiu/llama-3.2-3b-text2sql-lora", max_seq_length=2048, )
Llama-3.2-3B β Text-to-SQL LoRA Adapter
A LoRA adapter that teaches Llama-3.2-3B-Instruct to generate SQL queries from natural-language questions against a given table schema.
Fine-tuned on b-mc2/sql-create-context with Unsloth on a free Google Colab T4 GPU.
What it does
Input: SQL schema (CREATE TABLE ...) + question in plain English
Output: SQL query that answers the question
Examples
Example 1 β JOIN + aggregation + sorting + limit
Schema:
CREATE TABLE orders (id INT, user_id INT, amount DECIMAL, created_at DATE);
CREATE TABLE users (id INT, name VARCHAR, country VARCHAR);
Question: What are top 5 users by total order amount in 2025?
Model output:
SELECT T1.id, T1.name, T1.country, SUM(T2.amount) AS total_amount
FROM users AS T1 JOIN orders AS T2 ON T1.id = T2.user_id
WHERE T2.created_at >= 2025
GROUP BY T1.id ORDER BY total_amount DESC LIMIT 5
Example 2 β AVG with GROUP BY
Schema: CREATE TABLE employees (id INT, name VARCHAR, department VARCHAR, salary INT, hire_date DATE)
Question: Show average salary by department, sorted from highest to lowest
SELECT AVG(salary), department FROM employees
GROUP BY department ORDER BY AVG(salary) DESC
Example 3 β WHERE with multiple conditions
Schema: CREATE TABLE products (id INT, name VARCHAR, category VARCHAR, price DECIMAL, stock INT)
Question: Find all products with stock less than 10 in the 'electronics' category
SELECT * FROM products WHERE category = "electronics" AND stock < 10
Example 4 β COUNT with JOIN
Schema:
CREATE TABLE customers (id INT, name VARCHAR, country VARCHAR);
CREATE TABLE orders (id INT, customer_id INT, total DECIMAL);
Question: Count how many orders each customer from Ukraine has made
SELECT COUNT(*) FROM orders AS T1 JOIN customers AS T2
ON T1.customer_id = T2.id WHERE T2.country = "Ukraine"
Training details
| Parameter | Value |
|---|---|
| Base model | unsloth/Llama-3.2-3B-Instruct |
| Method | LoRA (PEFT) via Unsloth |
| Dataset | b-mc2/sql-create-context β 2,000 examples (random subset, seed=42) |
| Trainable params | 24.3M / 3.24B (0.75%) |
| LoRA config | r=16, alpha=16, dropout=0, target modules: Q/K/V/O + gate/up/down |
| Loss masking | train_on_responses_only β only SQL tokens contribute to loss |
| Optimizer | adamw_8bit, lr=2e-4, linear schedule, warmup 5 steps |
| Batch | per-device 2 Γ grad-accum 4 = effective 8 |
| Steps | 250 (= 1 epoch on 2,000 examples) |
| Hardware | Google Colab T4 (free) β 6.5 GB peak GPU memory (44% of 15 GB) |
| Training time | 7.44 minutes |
| Loss trajectory | 0.55 β 0.04 (final) |
Evaluation
Evaluated on 100 held-out examples from b-mc2/sql-create-context (indices 2000β2100, unseen during training).
| Metric | Value |
|---|---|
| Exact-match accuracy (normalized: lowercase, whitespace collapsed, quotes normalized) | 76% |
Notes on the failure mode
Manual analysis of the 24 "failures" shows that roughly half are semantically equivalent SQL that exact-match rejects:
| Type of "failure" | Example |
|---|---|
| Strings vs numbers | WHERE runs = 144 vs reference WHERE runs = "144" |
| Alias differences | SELECT Prime vs reference SELECT Prime AS minister |
| Ambiguous reference (dataset has minor errors) | SELECT SUM(poles) vs reference SELECT MIN(poles) |
| Genuine model errors (missing GROUP BY, wrong aggregation) | ~10 out of 100 |
A proper execution accuracy evaluation (running predictions against a real database) would likely show 85β90% correctness. LLM-as-judge with Claude/GPT-4 is the next planned improvement.
Known limitations
- English only. Trained on English questions only.
- ANSI SQL. No specific dialect (PostgreSQL / MySQL / Oracle).
- Naive date comparisons can appear (e.g.
created_at >= 2025instead ofcreated_at >= '2025-01-01'). - Implicit GROUP BY is occasionally missed when a question implies "per each X" without saying it.
- Trained on the original dataset distribution β complex 3+ table joins or window functions are out of distribution.
How to use
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
base = "unsloth/Llama-3.2-3B-Instruct"
adapter = "notingemiu/llama-3.2-3b-text2sql-lora"
tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
schema = "CREATE TABLE orders (id INT, total DECIMAL, created_at DATE)"
question = "What is total revenue in 2024?"
messages = [
{"role": "system", "content": f"You are a SQL expert. Use this schema:\n{schema}"},
{"role": "user", "content": question},
]
inputs = tokenizer.apply_chat_template(
messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids=inputs, max_new_tokens=200, temperature=0.1, do_sample=True)
print(tokenizer.decode(out[0], skip_special_tokens=True))
Author
Built as part of a portfolio for AI/LLM Engineer roles. Comments and feedback welcome.
- HuggingFace: @notingemiu
- Downloads last month
- 46
Model tree for notingemiu/llama-3.2-3b-text2sql-lora
Base model
meta-llama/Llama-3.2-3B-Instruct