
GPT-2.5-Code

GPT-2.5-Code is a 0.5B parameter causal language model specialized for Python coding tasks. It was created by fine-tuning the BikoRiko/GPT-2.5-Math base model.

Model Details

  • Developed by: BikoRiko
  • Model type: GPT-2 architecture
  • Language(s): Python, English
  • License: MIT
  • Fine-tuned from model: BikoRiko/GPT-2.5-Math

Training Procedure

  • Infrastructure: NVIDIA H100 GPU via Modal.com
  • Dataset: flytech/python-codes-25k (specialized subset)
  • Epochs: 5
  • Learning Rate: 1e-4 with a cosine schedule
  • Batch Size: 4, with gradient accumulation over 4 steps (a minimal training sketch follows this list)

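The hyperparameters above map onto a standard Hugging Face Trainer run. The sketch below is a minimal, unofficial reconstruction of that setup: the hyperparameters are taken from this card, while the dataset split, the "instruction"/"output" field names, the max_length, and the prompt template are assumptions rather than details confirmed by the author.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Start from the math-specialized base model named in the card.
tokenizer = AutoTokenizer.from_pretrained("BikoRiko/GPT-2.5-Math")
model = AutoModelForCausalLM.from_pretrained("BikoRiko/GPT-2.5-Math")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers ship without a pad token

dataset = load_dataset("flytech/python-codes-25k", split="train")  # split name is an assumption

def tokenize(example):
    # Field names "instruction"/"output" and the prompt template are assumptions.
    text = f"Instruction: {example['instruction']}\nOutput: {example['output']}"
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="gpt2.5-code",
    num_train_epochs=5,                # from the card
    learning_rate=1e-4,                # from the card
    lr_scheduler_type="cosine",        # cosine schedule, from the card
    per_device_train_batch_size=4,     # from the card
    gradient_accumulation_steps=4,     # from the card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
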
Intended Use

This model is designed for short-form Python code generation and completion tasks. Fine-tuning shifted its focus from the base model's mathematical reasoning toward structured Python programming.

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BikoRiko/GPT-2.5-Code")
model = AutoModelForCausalLM.from_pretrained("BikoRiko/GPT-2.5-Code")

instruction = "Write a python function to calculate the area of a circle."
# Build the instruction-style prompt, with "Output:" on its own line.
prompt = f"Instruction: {instruction}\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
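
Greedy decoding (the default above) returns a single deterministic completion; enabling sampling can produce more varied code. The values below are illustrative choices, not settings recommended by the model author.

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,     # sample instead of greedy decoding
    temperature=0.7,    # illustrative value, not from the card
    top_p=0.95,         # illustrative value, not from the card
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))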