
GPT-2.5-Code

GPT-2.5-Code is a 0.5B parameter causal language model specialized for Python coding tasks. It was created by fine-tuning the BikoRiko/GPT-2.5-Math base model.

Model Details

  • Developed by: BikoRiko
  • Model type: GPT-2 architecture
  • Language(s): Python, English
  • License: MIT
  • Fine-tuned from model: BikoRiko/GPT-2.5-Math

Training Procedure

  • Infrastructure: NVIDIA H100 GPU via Modal.com
  • Dataset: flytech/python-codes-25k (specialized subset)
  • Epochs: 5
  • Learning Rate: 1e-4 with a cosine schedule
  • Batch Size: 4, with gradient accumulation over 4 steps (a minimal training sketch follows this list)

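The hyperparameters above map onto a standard Hugging Face Trainer run. The sketch below is a minimal, unofficial reconstruction of that setup: the hyperparameters are taken from this card, while the dataset split, the "instruction"/"output" field names, the max_length, and the prompt template are assumptions rather than details confirmed by the author.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Start from the math-specialized base model named in the card.
tokenizer = AutoTokenizer.from_pretrained("BikoRiko/GPT-2.5-Math")
model = AutoModelForCausalLM.from_pretrained("BikoRiko/GPT-2.5-Math")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers ship without a pad token

dataset = load_dataset("flytech/python-codes-25k", split="train")  # split name is an assumption

def tokenize(example):
    # Field names "instruction"/"output" and the prompt template are assumptions.
    text = f"Instruction: {example['instruction']}\nOutput: {example['output']}"
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="gpt2.5-code",
    num_train_epochs=5,                # from the card
    learning_rate=1e-4,                # from the card
    lr_scheduler_type="cosine",        # cosine schedule, from the card
    per_device_train_batch_size=4,     # from the card
    gradient_accumulation_steps=4,     # from the card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
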
Intended Use

This model is designed for short-form Python code generation and completion tasks. Fine-tuning shifted its focus from the base model's mathematical reasoning toward structured Python programming.

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BikoRiko/GPT-2.5-Code")
model = AutoModelForCausalLM.from_pretrained("BikoRiko/GPT-2.5-Code")

instruction = "Write a python function to calculate the area of a circle."
# Build the instruction-style prompt, with "Output:" on its own line.
prompt = f"Instruction: {instruction}\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
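
Greedy decoding (the default above) returns a single deterministic completion; enabling sampling can produce more varied code. The values below are illustrative choices, not settings recommended by the model author.

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,     # sample instead of greedy decoding
    temperature=0.7,    # illustrative value, not from the card
    top_p=0.95,         # illustrative value, not from the card
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))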