amkyawdev

Update README.md: add training details, fix AutoModelForCausalLM, update model overview

209dc94 verified 4 days ago

metadata

pipeline_tag: text-generation
license: apache-2.0
tags:
  - code-generation
  - myanmar
  - burmese
  - qwen
  - qwen2
  - qwen2.5
  - qwen2.5-coder
  - transformers
  - conversational
  - text-generation
library_name: transformers
inference:
  parameters:
    max_new_tokens: 512
    temperature: 0.2
    top_p: 0.95
    repetition_penalty: 1.1
model-index:
  - name: amk-coder-v2
    results:
      - task:
          type: text-generation
          name: CodeGeneration
        dataset:
          name: HumanEval
          type: openai/openai_humaneval
        metrics:
          - type: pass_at_1
            value: 50
            verified: false
          - type: pass_at_10
            value: 75
            verified: false
      - task:
          type: text-generation
          name: PythonCodeGeneration
        dataset:
          name: MBPP
          type: abdshhayan/MBPP
        metrics:
          - type: pass_at_1
            value: 55
            verified: false

🤖 amk-coder-v2 — Myanmar Coding Agent

Myanmar Coding Assistant — Fine-tuned from Qwen2.5-Coder-1.5B using LoRA (PEFT)

Model Overview

amk-coder-v2 is a coding assistant localized for Myanmar (Burmese) users. It was fine-tuned from Qwen2.5-Coder-1.5B with LoRA, using the PEFT library.

Attribute	Value
Base Model	Qwen2.5-Coder-1.5B
Parameters	~1.5B (as in the base model)
Architecture	Qwen2ForCausalLM
Training Method	LoRA (PEFT) fine-tuning
Dataset	amkyawdev/mm-llm-coder-agent-dataset (4M rows)
Context Length	32,768 tokens
Format	Safetensors (BF16)
License	Apache-2.0
Languages	Burmese + English

Features

Feature	Description
🇲🇲 Myanmar Support	Full support for Myanmar Unicode text
💻 Code Generation	Python, JavaScript, C++, Java, and more
🐛 Debugging	Bug detection and fixes
📖 Code Explanation	Line-by-line explanations

Training Details

Parameter	Value
Framework	Transformers + PEFT
Training Method	LoRA fine-tuning
Target Modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Optimizer	paged_adamw_8bit
Learning Rate	3e-5
Epochs	3
Batch Size	8
Max Length	2048
Precision	FP16 mixed precision
Hardware	2x NVIDIA T4 GPUs (Kaggle)
Training Time	~3-5 hours
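
To give a feel for what adapting the seven target modules above involves, the sketch below counts the extra trainable weights LoRA would add. The LoRA rank is not stated in this README, so the rank passed in is purely illustrative. The layer sizes are assumptions taken from the published Qwen2.5-Coder-1.5B configuration (hidden size 1536, MLP size 8960, 28 blocks, key/value width 256); check the model's `config.json` before relying on them.

```python
# Rough count of trainable LoRA parameters for the target modules listed above.
# Assumptions (not from this README): the layer sizes and the rank value.
HIDDEN = 1536      # assumed hidden size
KV_DIM = 256       # assumed k_proj / v_proj output width (grouped-query attention)
MLP = 8960         # assumed MLP intermediate size
NUM_LAYERS = 28    # assumed number of transformer blocks

# (input width, output width) of each adapted linear layer in one block
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_param_count(rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) per adapted layer."""
    per_block = sum(rank * (d_in + d_out) for d_in, d_out in TARGET_SHAPES.values())
    return per_block * NUM_LAYERS


print(f"{lora_param_count(16):,} trainable parameters at an illustrative rank of 16")
```

Under these assumptions the adapter is on the order of 1% of the base model's weights, which is why this kind of fine-tuning can fit on T4-class GPUs.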

Chat Template (ChatML)

<|im_start|>system
You are an expert Myanmar AI coding agent with tool access.<|im_end|>
<|im_start|>user
{Instruction}
Tools available: {Tools}<|im_end|>
<|im_start|>assistant
Thought & Code:
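
As a minimal sketch of how the template above could be filled in by hand, the helper below substitutes the `{Instruction}` and `{Tools}` slots. The function name `build_prompt`, the comma-separated tool list, and the `none` placeholder are hypothetical choices; this README does not specify how tools are serialized.

```python
from typing import Sequence

# System message copied from the template above.
SYSTEM_PROMPT = "You are an expert Myanmar AI coding agent with tool access."


def build_prompt(instruction: str, tools: Sequence[str]) -> str:
    """Fill the {Instruction} and {Tools} slots of the ChatML template."""
    # Assumption: tools are given as plain names joined by commas.
    tool_list = ", ".join(tools) if tools else "none"
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{instruction}\n"
        f"Tools available: {tool_list}<|im_end|>\n"
        "<|im_start|>assistant\n"
        "Thought & Code:\n"
    )


print(build_prompt("Write a function that reverses a string", ["python", "shell"]))
```

When the tokenizer ships its own chat template, `tokenizer.apply_chat_template` is usually the safer route; a hand-built string like this is mainly useful for inspecting what the model is expected to see.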

Quick Start

Using Transformers (Python)

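# Method 1: Pipeline (recommended for beginners)
from transformers import pipeline

pipe = pipeline("text-generation", model="amkyawdev/amk-coder-v2")
messages = [
    # Burmese prompt: "Write a Python function. Sort using a list comprehension."
    {"role": "user", "content": "Python function တစ်ခုရေးပါ။ list comprehension နဲ့ sorting လုပ်ပေးပါ။"}
]
# do_sample=True so that temperature takes effect
result = pipe(messages, max_new_tokens=512, do_sample=True, temperature=0.2)
# With chat input, generated_text holds the whole conversation; the last message is the reply
print(result[0]["generated_text"][-1]["content"])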

# Method 2: Direct Model Loading
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("amkyawdev/amk-coder-v2")
model = AutoModelForCausalLM.from_pretrained(
    "amkyawdev/amk-coder-v2",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to reverse a string"}
]

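inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,  # returns input_ids and attention_mask together
    return_tensors="pt"
).to(model.device)

# do_sample=True so that temperature takes effect
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.2)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)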

Using vLLM (Production)

# Install vLLM
pip install vllm

# Start server
vllm serve "amkyawdev/amk-coder-v2" --tensor-parallel-size 1

# API call
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amkyawdev/amk-coder-v2",
    "messages": [
      {"role": "user", "content": "Hello, write Python code"}
    ],
    "max_tokens": 512,
    "temperature": 0.2
  }'
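
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above; the helper names `build_payload` and `chat` are hypothetical, and `top_p` is taken from the inference parameters in the model card metadata. It assumes the server is running locally on port 8000 as started above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1/chat/completions"  # endpoint from the curl example


def build_payload(prompt: str) -> dict:
    """Request body mirroring the curl example, plus top_p from the model card."""
    return {
        "model": "amkyawdev/amk-coder-v2",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "temperature": 0.2,
        "top_p": 0.95,
    }


def chat(prompt: str, url: str = BASE_URL, timeout: float = 120.0) -> str:
    """POST the payload and return the assistant's reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses put the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(chat("Hello, write Python code"))` should print the reply. Pointing `url` at port 30000 should work the same way against an SGLang server, since both expose the OpenAI-compatible route.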

Using SGLang

# Install SGLang
pip install sglang

# Start server
python -m sglang.launch_server --model-path "amkyawdev/amk-coder-v2" --port 30000

# API call
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "amkyawdev/amk-coder-v2",
    "messages": [{"role": "user", "content": "Write a hello world in Python"}]
  }'

Usage Examples

🇲🇲 Myanmar Prompts

messages = [
    # Burmese prompt: "Write a Python function. Sort the numbers."
    {"role": "user", "content": "Python function တစ်ခုရေးပါ။ ဂဏန်းတွေကို sorting လုပ်ပေးပါ။"}
]
# Output: def sort_numbers(numbers): return sorted(numbers)

🇬🇧 English Prompts

messages = [
    {"role": "user", "content": "Explain this code:\nfor i in range(10):\n    print(i)"}
]
# Output: This is a for loop that prints numbers 0 to 9

🐛 Debugging

messages = [
    {"role": "user", "content": "Fix this Python code:\nprint('Hello' + 5)"}
]
# Output: TypeError fix suggestion with corrected code

API Deployment

Backend Server

cd backend
pip install -r requirements.txt
export HF_TOKEN=hf_your_token
uvicorn app.main:app --host 0.0.0.0 --port 8000

Endpoints

Method	Endpoint	Description
GET	`/`	Health check
GET	`/health`	Service health status
POST	`/chat`	Streaming chat (SSE)
GET	`/demo`	Demo HTML interface
GET	`/models`	Model information

Request Format

# Streaming chat
curl -X POST "http://localhost:8000/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a Fibonacci function in Python"}
    ],
    "stream": true
  }'
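
Since `/chat` streams its reply as server-sent events, a client has to reassemble events from individual lines. The sketch below shows only the generic SSE framing (`data:` lines, with a blank line ending each event). What each event's payload contains is not documented in this README, so the function `iter_sse_data` is a hypothetical helper that returns the raw payload strings and leaves their interpretation open.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event.

    `data:` lines accumulate and a blank line ends the event. Comment
    lines (starting with ':') and other fields such as `event:` or
    `id:` are ignored in this sketch.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One leading space after the colon is not part of the payload.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient choice: emit a final event even without a trailing blank line.
        yield "\n".join(buffer)


sample = ["data: Hello\n", "\n", ": keep-alive\n", "data: wor\n", "data: ld\n", "\n"]
for payload in iter_sse_data(sample):
    print(repr(payload))
```

In a real client, the lines would come from a streaming HTTP response, decoded as UTF-8 one line at a time.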

Docker Deployment

# Using Docker Model Runner
docker model run hf.co/amkyawdev/amk-coder-v2

# Using vLLM Docker
docker run --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  --rm \
  vllm/vllm-openai:latest \
  --model amkyawdev/amk-coder-v2

⚠️ Limitations

Context Length - Maximum 32,768 tokens
Code Quality - May generate incorrect code; verify outputs
Myanmar Unicode - Works best with Unicode-encoded Burmese; convert Zawgyi-encoded text to Unicode before prompting
Domain Knowledge - Limited to common programming languages
Safety - May produce harmful content; use responsible AI practices
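
One inexpensive way to act on the code-quality caveat above is to check that generated Python at least parses before going any further. The sketch below uses the standard `ast` module; the helper name `python_syntax_error` is hypothetical. It only catches syntax errors, so it is a first filter, not a substitute for running tests.

```python
import ast
from typing import Optional


def python_syntax_error(code: str) -> Optional[str]:
    """Return None if `code` parses as Python, else a short error message."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


generated = "def reverse(s):\n    return s[::-1]\n"
print(python_syntax_error(generated))            # valid code, so None
print(python_syntax_error("def f(x) return x"))  # malformed, so a message
```

Code such as `print('Hello' + 5)` passes this check because it is syntactically valid and only fails at run time, so generated code still needs to be executed against tests in a sandbox.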

📖 Resources

🙏 Acknowledgments

Alibaba Cloud Qwen Team - Base model Qwen2.5-Coder
HuggingFace - Model hosting and infrastructure
Myanmar Developer Community - Testing and feedback

📝 License

Apache License 2.0 - See LICENSE file for details.

📧 Contact

Author: amkyawdev
HuggingFace: amkyawdev/amk-coder-v2
GitHub: github.com/amkyawdev

Made with ❤️ for Myanmar Developers