Papağan 1.3B

An open-source Turkish language model with 1.28 billion parameters, trained from scratch.

A $60 budget, one person, one week. The entire pipeline is open source, from tokenizer training to GGUF export.


Parameters	1.28B (1,283,887,360)
Architecture	Decoder-only Transformer (Llama-style)
Training data	1B tokens, 100% Turkish
Sources	mC4-TR + FineWeb-2-TR + Wikipedia-TR
Tokenizer	SentencePiece BPE, 32K vocab
Turkish token efficiency	4.5 chr/tok (Llama-2 on Turkish: ~7.8; 1.7× more efficient)
Pre-training	A100 80GB, ~75 hours, val_loss 4.80
SFT	LoRA r=16, ~4K instructions, val_loss 3.65
Cost	~$60 (Colab Pro+)

What This Model Can and Cannot Do

What it can do:

Generate Turkish text and complete sentences
Answer simple questions (22% accuracy)
Produce grammatically coherent Turkish (91%)
Give short definitions and explanations

What it cannot do:

Give correct factual information (dates and numbers are mostly wrong)
Write code, solve math problems, or reason logically
Generate long coherent text (repetition problem)
Work in English or any other language

This is a research/educational project. It was not designed for production use. It serves as a reference for learning the LLM pipeline and for Turkish NLP research.

Benchmark Results

Metric	Score	Description
QA accuracy	22.2% (8/36)	Geography, history, science, culture, and technology questions
Repetition score	0.204	0 = no repetition, 1 = entirely repetition
Coherence	90.6%	Grammatical correctness and coherence of meaning
Perplexity	37.7	Language modeling quality (lower is better)

Per-category breakdown: Language 100%, Technology 50%, Culture 17%, Geography 12%, History 12%, Science 12%.

Why so low? The model was trained on 1B tokens. For comparison, Hamza-xlarge (same size) used 300B tokens and Llama-3 used 15T tokens. These results are expected from a model trained on 300× less data than Hamza-xlarge.

Usage

With PyTorch (requires this repo's model.py)

import torch
import sentencepiece as spm
from safetensors.torch import load_file

from model import Papagan, PapaganConfig
from lora import apply_lora

# Pick the best available device: CUDA, then Apple MPS, then CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

config = PapaganConfig()
model = Papagan(config)

# Load the checkpoint (.safetensors files are read with safetensors, not torch.load)
state = load_file("model.safetensors", device="cpu")

# Apply LoRA if the checkpoint contains LoRA weights
if any("lora_" in k for k in state):
    model = apply_lora(model)

# strict=False tolerates missing or unexpected keys
model.load_state_dict(state, strict=False)
model = model.to(device).eval()

# Tokenizer
sp = spm.SentencePieceProcessor()
sp.Load("tokenizer.model")

# Ask a question
prompt = "Yapay zeka nedir?"  # "What is artificial intelligence?"
ids = [sp.bos_id(), 17] + sp.Encode(prompt) + [18]  # 17 = user, 18 = assistant
ids_t = torch.tensor([ids], device=device)

with torch.no_grad():
    output = model.generate(ids_t, max_new_tokens=100, temperature=0.7, top_k=40)

print(sp.Decode(output[0].tolist()))
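
The prompt layout in the snippet above (BOS, a user marker, the encoded prompt, then an assistant marker) can be factored into a small helper. This is a minimal sketch: `build_prompt_ids` is a hypothetical name that is not part of the repo, and the IDs 17 and 18 are taken from the comment in that snippet.

```python
from typing import Callable, List

USER_ID = 17       # user-turn marker, per the comment in the usage snippet
ASSISTANT_ID = 18  # assistant-turn marker, per the same comment


def build_prompt_ids(
    prompt: str,
    encode: Callable[[str], List[int]],
    bos_id: int,
) -> List[int]:
    """Lay out a single-turn prompt as: <bos> <user> prompt tokens <assistant>."""
    return [bos_id, USER_ID] + list(encode(prompt)) + [ASSISTANT_ID]
```

With the SentencePiece processor loaded as `sp`, this would be called as `build_prompt_ids(prompt, sp.Encode, sp.bos_id())`.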

With GGUF (llama.cpp)

Download papagan-1.3b-f16.gguf:

# Run with llama.cpp
./llama-cli -m papagan-1.3b-f16.gguf -p "Türkiye hakkında bilgi ver" -n 100

# Quantize first (4-bit, size 5 GB → ~1.2 GB)
./llama-quantize papagan-1.3b-f16.gguf papagan-q4_k_m.gguf Q4_K_M
./llama-cli -m papagan-q4_k_m.gguf -p "Merhaba" -n 100

Architecture Details

Papağan 1.3B
├── Embedding (32000 × 2048)              LM head = Embedding via weight tying
├── 24× Transformer Block
│   ├── RMSNorm → Multi-Head Attention    16 heads, head_dim=128
│   │   ├── Q/K/V/O Linear (bias=False)   RoPE (θ=10000) applied to Q and K
│   │   └── scaled_dot_product_attention  Causal mask, Flash Attention
│   └── RMSNorm → SwiGLU MLP
│       ├── Gate + Up (2048 → 5504)       SiLU(gate) × up
│       └── Down (5504 → 2048)
└── RMSNorm → LM Head (2048 → 32000)     Tied with embedding

Hyperparameter	Value
Layers	24
Hidden size	2048
Attention heads	16
Head dimension	128
MLP intermediate	5504
Max sequence length	2048
Vocab size	32,000
Norm epsilon	1e-5
RoPE theta	10,000
Activation	SwiGLU
Norm	RMSNorm
Position encoding	RoPE

Tokenizer Comparison

Let's tokenize the sentence "Türkiye'nin başkenti Ankara'dır" ("Ankara is the capital of Türkiye"):

Tokenizer	Token count	Fertility
Papağan BPE	7	4.5 chr/tok
Llama-2	~12	~2.7 chr/tok
GPT-4 (cl100k)	~10	~3.2 chr/tok

On Turkish text, the Papağan tokenizer produces 43% fewer tokens. This directly affects inference speed and cost.
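
The relationship between token counts and characters per token can be checked by hand for the example sentence. The sketch below is illustrative only: the helper names are hypothetical, and the token counts are copied from the table (the Llama-2 and GPT-4 counts are approximate there). The table's chr/tok figures were presumably measured on a larger sample, so the single-sentence values computed here differ slightly.

```python
import unicodedata

SENTENCE = "Türkiye'nin başkenti Ankara'dır"

# Token counts copied from the comparison table; two of them are approximate.
TOKEN_COUNTS = {"Papağan BPE": 7, "Llama-2": 12, "GPT-4 (cl100k)": 10}


def chars_per_token(text: str, n_tokens: int) -> float:
    """Average number of characters covered by one token (NFC-normalized)."""
    return len(unicodedata.normalize("NFC", text)) / n_tokens


def token_reduction(ours: int, other: int) -> float:
    """Fraction of tokens saved relative to another tokenizer."""
    return 1 - ours / other


for name, n in TOKEN_COUNTS.items():
    saved = token_reduction(TOKEN_COUNTS["Papağan BPE"], n)
    print(f"{name}: {chars_per_token(SENTENCE, n):.2f} chr/tok, "
          f"Papağan uses {saved:.0%} fewer tokens")
```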

Training Process

Data Pipeline

Raw data (56 GB) → Language filtering → Dedup → Quality scoring → Clean data (54 GB)
                                                                          ↓
                                                          Tokenize (1B tokens, 4 GB binary)

Pre-training Curve

Step     0: loss ~10.0  (random init)
Step   500: loss ~6.5   (warmup finished)
Step  5000: loss ~5.2   (rapid decline)
Step 10000: loss ~4.9   (slowing down)
Step 15000: loss ~4.7   (final) → val_loss 4.80, ppl 121

SFT Curve

Epoch 1: val_loss 3.81 → ppl 45
Epoch 2: val_loss 3.66 → ppl 39
Epoch 3: val_loss 3.65 → ppl 38  (best)
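
The loss and perplexity pairs in the training curves are consistent with perplexity being the exponential of the mean cross-entropy loss. A minimal sketch of that conversion, assuming the losses are reported in nats per token:

```python
import math


def perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (nats per token) to perplexity."""
    return math.exp(loss)


REPORTED_LOSSES = [
    ("pre-training", 4.80),
    ("SFT epoch 1", 3.81),
    ("SFT epoch 2", 3.66),
    ("SFT epoch 3", 3.65),
]

for label, loss in REPORTED_LOSSES:
    print(f"{label}: val_loss {loss:.2f} -> ppl {perplexity(loss):.1f}")
```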

SFT used LoRA: only 6.3M parameters (0.49%) were trained. With loss masking, the loss was computed only on the assistant's response.
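
The 6.3M figure can be reproduced with a short calculation. The source does not say which matrices carry adapters, so the sketch below assumes rank-16 adapters on the four attention projections (Q, K, V, O) in all 24 blocks; `lora_parameter_count` is a hypothetical helper, not code from the repo.

```python
def lora_parameter_count(
    rank: int = 16,
    hidden: int = 2048,
    n_layers: int = 24,
    adapted_per_layer: int = 4,  # assumption: Q, K, V, and O projections
) -> int:
    """Count trainable LoRA weights when each adapted matrix is hidden x hidden."""
    # Each adapter adds two low-rank factors: A (rank x hidden) and B (hidden x rank).
    per_matrix = rank * (hidden + hidden)
    return n_layers * adapted_per_layer * per_matrix


TOTAL_PARAMETERS = 1_283_887_360  # model size as reported in the summary table

trainable = lora_parameter_count()
print(f"{trainable:,} trainable ({trainable / TOTAL_PARAMETERS:.2%})")
```

Under these assumptions the count is 6,291,456, which matches the reported 6.3M and 0.49%.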

Files

File	Size	Description
`model.safetensors`	4.95 GB	SFT model (LoRA merged, float32)
`papagan-1.3b-f16.gguf`	5.13 GB	GGUF format (llama.cpp compatible)
`tokenizer.model`	781 KB	SentencePiece BPE tokenizer
`config.json`	0.4 KB	Model configuration
`tokenizer_config.json`	0.2 KB	Tokenizer metadata

Limitations and Risks

Low factual accuracy: Mostly gives wrong answers to history, science, and geography questions
Hallucination: May generate information that is not true
No safety training: No RLHF/DPO was performed, and there is no harmful-content filtering
Bias: May reflect biases present in the training data
Repetition: Word or sentence repetition can occur in long outputs
Turkish only: Does not support other languages

This model is intended for research and educational purposes. It should not be used for critical decisions.

Citation

@misc{papagan2026,
  title={Papağan 1.3B: A Turkish Language Model Trained from Scratch},
  author={Ercan Holasoglu},
  year={2026},
  url={https://github.com/ercanholasoglu/papagan}
}

Source Code

The entire training pipeline (tokenizer training, data preparation, pre-training, SFT, benchmark, conversion) is open source:

github.com/ercanholasoglu/papagan

License

Apache 2.0
