Llama-3.2-3B — Text-to-SQL LoRA Adapter

A LoRA adapter that teaches Llama-3.2-3B-Instruct to generate SQL queries from natural-language questions against a given table schema.

Fine-tuned on b-mc2/sql-create-context with Unsloth on a free Google Colab T4 GPU.

What it does

Input: SQL schema (CREATE TABLE ...) + question in plain English Output: SQL query that answers the question

Examples

Example 1 — JOIN + aggregation + sorting + limit

Schema:

CREATE TABLE orders (id INT, user_id INT, amount DECIMAL, created_at DATE);
CREATE TABLE users (id INT, name VARCHAR, country VARCHAR);

Question: What are top 5 users by total order amount in 2025?

Model output:

SELECT T1.id, T1.name, T1.country, SUM(T2.amount) AS total_amount
FROM users AS T1 JOIN orders AS T2 ON T1.id = T2.user_id
WHERE T2.created_at >= 2025
GROUP BY T1.id ORDER BY total_amount DESC LIMIT 5

Example 2 — AVG with GROUP BY

Schema: CREATE TABLE employees (id INT, name VARCHAR, department VARCHAR, salary INT, hire_date DATE) Question: Show average salary by department, sorted from highest to lowest

SELECT AVG(salary), department FROM employees
GROUP BY department ORDER BY AVG(salary) DESC

Example 3 — WHERE with multiple conditions

Schema: CREATE TABLE products (id INT, name VARCHAR, category VARCHAR, price DECIMAL, stock INT) Question: Find all products with stock less than 10 in the 'electronics' category

SELECT * FROM products WHERE category = "electronics" AND stock < 10

Example 4 — COUNT with JOIN

Schema:

CREATE TABLE customers (id INT, name VARCHAR, country VARCHAR);
CREATE TABLE orders (id INT, customer_id INT, total DECIMAL);

Question: Count how many orders each customer from Ukraine has made

SELECT COUNT(*) FROM orders AS T1 JOIN customers AS T2
ON T1.customer_id = T2.id WHERE T2.country = "Ukraine"

Training details

Parameter	Value
Base model	`unsloth/Llama-3.2-3B-Instruct`
Method	LoRA (PEFT) via Unsloth
Dataset	`b-mc2/sql-create-context` — 2,000 examples (random subset, seed=42)
Trainable params	24.3M / 3.24B (0.75%)
LoRA config	`r=16, alpha=16, dropout=0`, target modules: Q/K/V/O + gate/up/down
Loss masking	`train_on_responses_only` — only SQL tokens contribute to loss
Optimizer	`adamw_8bit`, lr=2e-4, linear schedule, warmup 5 steps
Batch	per-device 2 × grad-accum 4 = effective 8
Steps	250 (= 1 epoch on 2,000 examples)
Hardware	Google Colab T4 (free) — 6.5 GB peak GPU memory (44% of 15 GB)
Training time	7.44 minutes
Loss trajectory	0.55 → 0.04 (final)

Evaluation

Evaluated on 100 held-out examples from b-mc2/sql-create-context (indices 2000–2100, unseen during training).

Metric	Value
Exact-match accuracy (normalized: lowercase, whitespace collapsed, quotes normalized)	76%

Notes on the failure mode

Manual analysis of the 24 "failures" shows that roughly half are semantically equivalent SQL that exact-match rejects:

Type of "failure"	Example
Strings vs numbers	`WHERE runs = 144` vs reference `WHERE runs = "144"`
Alias differences	`SELECT Prime` vs reference `SELECT Prime AS minister`
Ambiguous reference (dataset has minor errors)	`SELECT SUM(poles)` vs reference `SELECT MIN(poles)`
Genuine model errors (missing GROUP BY, wrong aggregation)	~10 out of 100

A proper execution accuracy evaluation (running predictions against a real database) would likely show 85–90% correctness. LLM-as-judge with Claude/GPT-4 is the next planned improvement.

Known limitations

English only. Trained on English questions only.
ANSI SQL. No specific dialect (PostgreSQL / MySQL / Oracle).
Naive date comparisons can appear (e.g. created_at >= 2025 instead of created_at >= '2025-01-01').
Implicit GROUP BY is occasionally missed when a question implies "per each X" without saying it.
Trained on the original dataset distribution — complex 3+ table joins or window functions are out of distribution.

How to use

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base = "unsloth/Llama-3.2-3B-Instruct"
adapter = "notingemiu/llama-3.2-3b-text2sql-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)

schema = "CREATE TABLE orders (id INT, total DECIMAL, created_at DATE)"
question = "What is total revenue in 2024?"

messages = [
    {"role": "system", "content": f"You are a SQL expert. Use this schema:\n{schema}"},
    {"role": "user", "content": question},
]
inputs = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids=inputs, max_new_tokens=200, temperature=0.1, do_sample=True)
print(tokenizer.decode(out[0], skip_special_tokens=True))

Author

Built as part of a portfolio for AI/LLM Engineer roles. Comments and feedback welcome.

HuggingFace: @notingemiu

Downloads last month: 46

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for notingemiu/llama-3.2-3b-text2sql-lora

Base model

meta-llama/Llama-3.2-3B-Instruct

Finetuned

unsloth/Llama-3.2-3B-Instruct

Adapter

(411)

this model

notingemiu
/

llama-3.2-3b-text2sql-lora