Instructions to use Anserwise/AWAXIS-KR-31B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Anserwise/AWAXIS-KR-31B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Anserwise/AWAXIS-KR-31B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Run the pipeline and print the generated result
print(pipe(text=messages))

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Anserwise/AWAXIS-KR-31B")
model = AutoModelForImageTextToText.from_pretrained("Anserwise/AWAXIS-KR-31B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Anserwise/AWAXIS-KR-31B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Anserwise/AWAXIS-KR-31B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Anserwise/AWAXIS-KR-31B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
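
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on `localhost:8000`; the helper names `build_chat_payload` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "Anserwise/AWAXIS-KR-31B"


def build_chat_payload(prompt, image_url, model=MODEL_ID):
    """Build an OpenAI-style chat request with one text part and one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the reply text.

    base_url is an assumption: it must point at a running server.
    """
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat(build_chat_payload("Describe this image in one sentence.", image_url))` should return the model's reply as a string. Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead.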

Use Docker

docker model run hf.co/Anserwise/AWAXIS-KR-31B

SGLang

How to use Anserwise/AWAXIS-KR-31B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Anserwise/AWAXIS-KR-31B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Anserwise/AWAXIS-KR-31B",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Anserwise/AWAXIS-KR-31B" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use Anserwise/AWAXIS-KR-31B with Docker Model Runner:
```
docker model run hf.co/Anserwise/AWAXIS-KR-31B
```

AWAXIS-KR-31B

📌 Description

AWAXIS-KR-31B is a Darwin V8 FFN-crossbreed derivative that combines a Korean-specialized MoE base (JDONE-Research/AIOne-Agent-52B-A36B-it) with an Opus-distill reasoning signal (Anserwise/AWAXIS-Think-31B). It builds on the Gemma-4 MoE architecture (8 experts with top-2 routing, 52B total / 36B active parameters, vision/audio token support) and is optimized for Korean instruction following, knowledge and cultural QA, and step-by-step reasoning and math. It scored 80.0% on a composite of four Korean benchmarks. Image-text input is supported through the Gemma4 multimodal base.

🧬 Model Lineage

AWAXIS-KR-31B  (this model, Darwin-derived)
├── Mother  (kept full)
│   └── JDONE-Research/AIOne-Agent-52B-A36B-it
│       - Korean-specialized Gemma4 MoE, 52B / A36B
│
└── Father  (dense-FFN donor)
    └── Anserwise/AWAXIS-Think-31B
        ├── Grandmother  (kept full)
        │   └── TeichAI/gemma-4-31B-it-Claude-Opus-Distill-v2
        │       - Claude Opus reasoning-distill base
        │
        └── Grandfather  (FFN donor)
            └── google/gemma-4-31B-it
                - Gemma-4 base

Direct parents

Role	Model	Contribution
Mother (kept)	JDONE-Research/AIOne-Agent-52B-A36B-it	Korean ability, MoE routing, experts, attention, and embeddings preserved 100%
Father (FFN donor)	Anserwise/AWAXIS-Think-31B	Injects the Opus-distill reasoning signal through the dense FFN path
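
The card does not spell out how Darwin V8 combines the two parents, so the following is only a toy illustration of the general idea of keeping one parent whole while taking dense-FFN weights from a donor. Everything in it is an assumption: the function `crossbreed`, the tensor names, the `.mlp.` name filter, and the blend weight `alpha` are hypothetical, and flat Python lists stand in for weight tensors.

```python
def crossbreed(mother, father, is_dense_ffn, alpha=0.5):
    """Return a child state dict (hypothetical scheme, not the Darwin V8 recipe).

    Every tensor is copied from the mother, except tensors that the
    is_dense_ffn predicate selects and that exist in both parents with the
    same length; those are linearly blended with weight alpha toward the father.
    """
    child = {}
    for name, m_w in mother.items():
        f_w = father.get(name)
        if is_dense_ffn(name) and f_w is not None and len(f_w) == len(m_w):
            child[name] = [(1 - alpha) * m + alpha * f for m, f in zip(m_w, f_w)]
        else:
            child[name] = list(m_w)
    return child


# Toy "state dicts" with made-up tensor names.
mother = {
    "layers.0.self_attn.q_proj": [1.0, 1.0],
    "layers.0.mlp.down_proj": [2.0, 4.0],
    "layers.0.experts.0.down_proj": [3.0, 3.0],
}
father = {
    "layers.0.self_attn.q_proj": [9.0, 9.0],
    "layers.0.mlp.down_proj": [4.0, 8.0],
}

child = crossbreed(mother, father, is_dense_ffn=lambda n: ".mlp." in n, alpha=0.5)
print(child)
```

In this toy run only `layers.0.mlp.down_proj` changes; the attention and expert tensors stay identical to the mother's, which mirrors the "kept" and "FFN donor" roles in the table.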

Paternal grandparents

Role	Model
Grandmother	TeichAI/gemma-4-31B-it-Claude-Opus-Distill-v2
Grandfather	google/gemma-4-31B-it

Common ancestor: the Google Gemma-4 architecture.

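📚 Datasets Used

The model's Korean ability was evaluated with standard Korean LLM benchmark datasets from the K-AI Hub (NIA AI Hub) / K-AI Leaderboard (aihub.or.kr) ecosystem.

Dataset	Domain	Source
KMMLU	Korean knowledge (45 subjects)	HAERAE-HUB/KMMLU
HAE_RAE_BENCH_1.1	Korean comprehension and culture (13 subsets)	HAERAE-HUB/HAE_RAE_BENCH_1.1
HRM8K	Korean math and reasoning (Korean version of GSM8K)	HAERAE-HUB/HRM8K
CLIcK	Korean culture and language	EunsuKim/CLIcK

These datasets are public assets curated by the Korean research community, including HAERAE-HUB and EunsuKim, and adopted as evaluation standards by the K-AI Hub.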

🏗 Architecture


Class	`Gemma4ForConditionalGeneration` (multimodal: text + image + audio)
Parameters	52B total · 36B active (MoE, 8 experts, top-2 routing)
Layers	60
Hidden / Intermediate	5,376 / 21,504
Attention heads / head_dim	32 / 256
Vocab	262,144 (Gemma-4 tokenizer)
dtype	bfloat16
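
As a rough sizing aid, the figures in the table can be turned into a weight-memory estimate. This is a back-of-the-envelope sketch: it counts weights only (no KV cache, activations, or framework overhead), and the helper `weight_memory_gb` is a name introduced here for illustration.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"bfloat16": 2, "float32": 4}


def weight_memory_gb(n_params, dtype="bfloat16"):
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


# All experts must be resident in memory, so the 52B total applies here,
# not the 36B active count.
total_gb = weight_memory_gb(52e9, "bfloat16")
print(f"bf16 weights: ~{total_gb:.0f} GB")  # bf16 weights: ~104 GB

# Width of the query projection if all 32 heads use head_dim 256.
q_width = 32 * 256
print(q_width)  # 8192
```

The 36B active figure affects compute per token rather than storage, so hardware should be sized against the 52B total.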

📊 Measured Benchmarks

Benchmark	Setting	Score
Korean 4-benchmark composite · n=80, seed=42	greedy	80.0%
↳ KMMLU (knowledge)	20Q, greedy	70.0%
↳ HAERAE-Bench (comprehension)	20Q, greedy	75.0%
↳ HRM8K (math)	20Q, greedy	90.0%
↳ CLIcK (culture and language)	20Q, greedy	85.0%
CLIcK (n=200)	greedy	88.0%
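
The composite score follows directly from the four 20-question rows. The sketch below reproduces that arithmetic and, as an addition not present in the card, computes a 95% Wilson score interval to show how wide the uncertainty is at n=80; the function name `wilson_interval` and the choice of interval are assumptions made here.

```python
import math

# Per-benchmark accuracy from the table, 20 questions each.
scores = {"KMMLU": 0.70, "HAERAE-Bench": 0.75, "HRM8K": 0.90, "CLIcK": 0.85}
QUESTIONS_PER_BENCH = 20

# Recover the number of correct answers per benchmark and pool them.
correct = sum(round(acc * QUESTIONS_PER_BENCH) for acc in scores.values())
n = QUESTIONS_PER_BENCH * len(scores)
composite = correct / n
print(f"{correct}/{n} = {composite:.1%}")  # 64/80 = 80.0%


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for k successes out of n trials (z=1.96 for ~95%)."""
    p = k / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


lo, hi = wilson_interval(correct, n)
print(f"95% interval: {lo:.3f} to {hi:.3f}")
```

With 80 questions the interval spans roughly seventeen percentage points, so small differences between models at this sample size should be read with care.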

🎯 Intended Use

Korean instruction following
Knowledge and cultural QA, reasoning and math
General Korean LLM tasks
Multimodal input (image-text-to-text), a capability inherited from the Gemma-4 base

🚀 Inference

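from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and the model in bfloat16, placed automatically across available devices.
tok = AutoTokenizer.from_pretrained("Anserwise/AWAXIS-KR-31B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Anserwise/AWAXIS-KR-31B",
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    attn_implementation="eager",
)

# Korean prompt: "Please explain, step by step, how Korea overcame its foreign-exchange crisis."
msgs = [{"role": "user", "content": "한국의 외환위기 극복 과정을 단계별로 설명해 주세요."}]
# Render the chat template to a string, then tokenize it.
text = tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
inp = tok(text, return_tensors="pt").to(model.device)
# Greedy decoding (do_sample=False).
out = model.generate(**inp, max_new_tokens=2048, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inp["input_ids"].shape[-1]:], skip_special_tokens=True))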