RomanTeucher/text2cypher-curated
A fine-tuned version of SmolLM2-135M-Instruct that generates Cypher queries
from natural language questions and a graph schema.
Evaluated on 50 test examples:
Token F1 rose from 16.7% to 27.4% when scaling from 200 to 1000 training examples, a 64% relative improvement, which suggests genuine learning.
Note: the 200-sample training run was done only to check CPU performance and loss behaviour.
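The card does not spell out how Token F1 is computed. A common definition is the harmonic mean of token-level precision and recall over a bag-of-tokens overlap between the predicted and reference query. The sketch below assumes that definition with plain whitespace tokenization; the function names `token_f1` and `relative_improvement` are hypothetical and are not taken from the evaluation script used for this model.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Bag-of-tokens F1 between two query strings (whitespace tokenization assumed)."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Two empty strings count as a match; exactly one empty string is a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection: a shared token counts only as often as it occurs in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed as a fraction."""
    return (after - before) / before


if __name__ == "__main__":
    pred = "MATCH (p:Person)-[:DIRECTED]->(m:Movie) RETURN m.title"
    ref = "MATCH (p:Person)-[:DIRECTED]->(m:Movie) WHERE m.year < 2010 RETURN m.title"
    print(round(token_f1(pred, ref), 3))               # 0.667
    print(round(relative_improvement(16.7, 27.4), 2))  # 0.64
```

The second printed value reproduces the 64% figure quoted above: it is a relative gain over the 200-example score, not a gain of 64 percentage points.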
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM2-135M-Instruct"
)
model = PeftModel.from_pretrained(base_model, "kv-rane/text2cypher-smollm2")
tokenizer = AutoTokenizer.from_pretrained("kv-rane/text2cypher-smollm2")
tokenizer.pad_token = tokenizer.eos_token
schema = "Movie {title, year}, Person {name}, (Person)-[:DIRECTED]->(Movie)"
question = "Which movies did Christopher Nolan direct before 2010?"
prompt = f"""### Schema:
{schema}
### Question:
{question}
### Cypher:
"""
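inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding (do_sample=False), capped at 128 new tokens.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Drop the prompt tokens so that only the generated Cypher is decoded.
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))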
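When querying the model repeatedly, it can help to wrap the prompt layout from the example in small helpers so that every request uses the same section headers. The following is a minimal sketch: `build_prompt` and `extract_cypher` are hypothetical names, and the trimming step assumes the model may sometimes keep generating a further `###` section after the query, which the card does not state.

```python
PROMPT_TEMPLATE = "### Schema:\n{schema}\n### Question:\n{question}\n### Cypher:\n"


def build_prompt(schema: str, question: str) -> str:
    """Fill the Schema / Question / Cypher layout used in the usage example."""
    return PROMPT_TEMPLATE.format(schema=schema.strip(), question=question.strip())


def extract_cypher(generated_text: str) -> str:
    """Keep only the text before any further '###' header (an assumed failure mode)."""
    return generated_text.split("###", 1)[0].strip()


if __name__ == "__main__":
    schema = "Movie {title, year}, Person {name}, (Person)-[:DIRECTED]->(Movie)"
    question = "Which movies did Christopher Nolan direct before 2010?"
    print(build_prompt(schema, question))
    print(extract_cypher("MATCH (m:Movie) RETURN m.title\n### Question:\n..."))
```

The string returned by `build_prompt` can be passed to the tokenizer in place of the inline f-string, and `extract_cypher` can be applied to the decoded output before the query is sent to a database.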
This model uses LoRA instead of full fine-tuning.
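To illustrate why LoRA is cheaper than full fine-tuning, the sketch below compares the number of trainable parameters for one weight matrix and shows the low-rank forward pass, W x + (alpha / r) B (A x), in plain Python. The dimensions, rank, and alpha are illustrative assumptions, not this adapter's actual configuration, and all function names are hypothetical.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_update_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float, rank: int):
    """Frozen base output plus the scaled low-rank correction."""
    base = matvec(W, x)              # frozen pretrained weights
    delta = matvec(B, matvec(A, x))  # low-rank update, computed as B (A x)
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


if __name__ == "__main__":
    d, r = 576, 8  # assumed square projection size and rank, for illustration only
    full = full_update_params(d, d)
    lora = lora_update_params(d, d, r)
    print(full, lora, f"{lora / full:.2%}")

    # Tiny rank-1 example: identity base weights plus a low-rank correction.
    W = [[1, 0], [0, 1]]
    A = [[1, 1]]    # 1 x 2
    B = [[1], [0]]  # 2 x 1
    print(lora_forward(W, A, B, [2, 3], alpha=2, rank=1))
```

Under these assumed sizes the adapter trains about 3% of the parameters of the matrix it modifies, while the base weights stay frozen.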
Base model
HuggingFaceTB/SmolLM2-135M