# GPT-2.5-Code
GPT-2.5-Code is a 0.5B parameter causal language model specialized for Python coding tasks. It was created by fine-tuning the BikoRiko/GPT-2.5-Math base model.
## Model Details
- Developed by: BikoRiko
- Model type: GPT-2 architecture
- Language(s): Python, English
- License: MIT
- Fine-tuned from model: BikoRiko/GPT-2.5-Math
## Training Procedure
- Infrastructure: NVIDIA H100 GPU via Modal.com
- Dataset: flytech/python-codes-25k (specialized subset)
- Epochs: 5
- Learning Rate: 1e-4 (cosine schedule)
- Batch Size: 4 with gradient accumulation (4 steps); see the training sketch below
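
For readers who want to reproduce a comparable run, here is a minimal sketch that plugs the hyperparameters above into a standard Hugging Face `Trainer`. It is not the original training script: the dataset column names, the 512-token truncation length, and the prompt template are assumptions made for illustration.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "BikoRiko/GPT-2.5-Math"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2-style tokenizers ship without a pad token

dataset = load_dataset("flytech/python-codes-25k", split="train")

def tokenize(batch):
    # Assumed column names and prompt template, mirroring the inference example below.
    texts = [
        f"Instruction: {ins}\nOutput: {out}{tokenizer.eos_token}"
        for ins, out in zip(batch["instruction"], batch["output"])
    ]
    return tokenizer(texts, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="gpt-2.5-code",
    num_train_epochs=5,                 # from the card
    learning_rate=1e-4,                 # from the card
    lr_scheduler_type="cosine",         # from the card
    per_device_train_batch_size=4,      # from the card
    gradient_accumulation_steps=4,      # from the card (effective batch size 16)
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```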
## Intended Use
This model is designed for short-form Python code generation and completion tasks. Fine-tuning shifted the base model's focus from mathematical reasoning to structured programming.
## How to use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BikoRiko/GPT-2.5-Code")
model = AutoModelForCausalLM.from_pretrained("BikoRiko/GPT-2.5-Code")

# Prompt template: "Instruction: ...\nOutput:"
instruction = "Write a python function to calculate the area of a circle."
prompt = f"Instruction: {instruction}\nOutput:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
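
By default `generate` uses greedy decoding, which can yield repetitive code. For more varied completions, a sampling configuration along these lines is a reasonable starting point; the values below are illustrative and not tuned for this model.

```python
# Illustrative sampling settings; not tuned values from the model card.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad-token warning for GPT-2-style models
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```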