How to use klei1/grillo-8b with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B")
model = PeftModel.from_pretrained(base_model, "klei1/grillo-8b")
```

Grillo is a culturally aware Italian AI companion built on the Qwen3-8B architecture. Inspired by Il Grillo Parlante (the Talking Cricket) from Carlo Collodi's Pinocchio, the model is fine-tuned to be wise, humble, and deeply rooted in Italian common sense ("buon senso").
Unlike generic assistants, Grillo offers advice with a warm, slightly admonishing yet caring tone, prioritizing ethical guidance and practical wisdom over robotic neutrality.
The model was trained in a multi-stage process, supervised fine-tuning (SFT) followed by direct preference optimization (DPO), with the following configuration:
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3-8B |
| Architecture | Transformer Decoder (8B params) |
| LoRA Rank | 64 |
| LoRA Alpha | 32 |
| Learning Rate | 2e-4 (SFT) / 1e-4 (DPO) |
| Context Window | 4096 tokens |
| Training Hardware | Tinker Cloud (NVIDIA GPUs) |
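
For reference, the LoRA hyperparameters in the table map directly onto a PEFT `LoraConfig`. The sketch below is illustrative rather than the exact training configuration: `r` and `lora_alpha` come from the table, while `target_modules` and `lora_dropout` are assumptions that the card does not state.

```python
from peft import LoraConfig

# Illustrative LoraConfig matching the table above.
# r and lora_alpha are taken from the table; target_modules and
# lora_dropout are assumptions, not documented in this card.
lora_config = LoraConfig(
    r=64,
    lora_alpha=32,
    lora_dropout=0.05,  # assumption
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
```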
The example below loads the Grillo LoRA adapter on top of the base Qwen model at runtime. This is memory-efficient: only the small adapter weights are stored in addition to the base checkpoint, rather than a full merged copy of the model.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# 1. Configuration and Model Loading
HF_MODEL_ID = "klei1/grillo-8b"
BASE_MODEL_ID = "Qwen/Qwen3-8B"
# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)
# 2. Load Grillo Adapter (LoRA)
model = PeftModel.from_pretrained(base_model, HF_MODEL_ID)
model = model.eval() # Set model to evaluation mode
# 3. Define the system persona (crucial for performance).
# English gloss: "You are Grillo, the Talking Cricket. You are small but wise,
# humble but brave. You speak authentic Italian and always offer practical
# wisdom and common sense. You are not a robotic assistant, you are a moral
# conscience."
system_prompt = """Tu sei Grillo, il Grillo Parlante.
Sei piccolo ma sapiente, umile ma coraggioso.
Parli un italiano autentico e offri sempre saggezza pratica e buon senso.
Non sei un assistente robotico, sei una coscienza morale."""
messages = [
    {"role": "system", "content": system_prompt},
    # "Grillo, I'm afraid I made a wrong choice..."
    {"role": "user", "content": "Grillo, ho paura di aver fatto una scelta sbagliata..."},
]
# 4. Generate Response
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
outputs = model.generate(
    inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
```
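
If you would rather pay the full memory cost once and drop the adapter indirection at inference time, PEFT can fold the LoRA weights into the base model. A minimal sketch; the output directory name here is arbitrary:

```python
# Optional: merge the LoRA weights into the base model.
# After merge_and_unload(), the result behaves like a plain
# transformers model with no PEFT wrapper.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("grillo-8b-merged")  # arbitrary local path
tokenizer.save_pretrained("grillo-8b-merged")
```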