Wind Edge 1.6 (SFT)

Compact dense decoder-only transformer for edge inference.

Architecture

Wind Edge is a custom dense transformer (RMSNorm + RoPE + GQA + SwiGLU) with per-head Q/K normalization. The repository ships its own WindEdgeForCausalLM and WindEdgeConfig classes, so load it with trust_remote_code=True.
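
To illustrate what per-head Q/K normalization does, here is a minimal sketch in plain Python. It assumes the normalization is an RMSNorm applied to each head's query and key vectors before the attention dot product, and it omits any learned gain. The function names and toy values are hypothetical and are not taken from the Wind Edge code.

```python
import math

def rms_norm(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is ~1 (no learned gain in this sketch)."""
    mean_sq = sum(x * x for x in vec) / len(vec)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [x * scale for x in vec]

def qk_norm_per_head(heads, eps=1e-6):
    """Normalize each head's query (or key) vector independently of the other heads."""
    return [rms_norm(h, eps) for h in heads]

def attention_logit(q, k):
    """Scaled dot product between one query vector and one key vector."""
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))

# Two heads whose raw activations differ by a factor of 100 (head_dim = 4, made-up values).
queries = [[1.0, 2.0, 3.0, 4.0], [100.0, 200.0, 300.0, 400.0]]
normed = qk_norm_per_head(queries)

# After normalization both heads have the same scale, so the logit magnitude
# is bounded by sqrt(head_dim) regardless of the raw activation size.
logit = attention_logit(normed[0], normed[1])
```

Keeping the logits bounded in this way is the usual motivation for Q/K normalization. Whether Wind Edge applies it before or after RoPE is not stated in this card.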

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True is required because the model classes live in the repository.
model = AutoModelForCausalLM.from_pretrained("arthu1/wind-edge-1.6-sft", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("arthu1/wind-edge-1.6-sft")
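
Once the model and tokenizer load, text generation might look like the sketch below. The helper name generate_reply, the default max_new_tokens, and the use of greedy decoding on a raw prompt are assumptions for illustration. This card does not say whether the checkpoint expects a chat template.

```python
def generate_reply(prompt, repo_id="arthu1/wind-edge-1.6-sft", max_new_tokens=64):
    """Load the checkpoint and greedily decode a continuation of `prompt`."""
    # Imported lazily so the helper can be defined without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    model.eval()

    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True)
```

Calling generate_reply("Hello") downloads the weights on first use. For repeated calls, load the model once outside the helper instead.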

Pipeline

  1. Continued pretraining on GLM reasoning, CodeX thinking, and Nemotron post-training data.
  2. Supervised fine-tuning on Wind Edge SFT, identity pairs, Claude-style assistant data, and safety/chat rows.
  3. Depth pruning (block-influence) to ~0.45B parameters.
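
The depth-pruning step can be sketched as follows, assuming "block-influence" refers to the ShortGPT-style score: one minus the mean cosine similarity between a block's input and output hidden states, with the lowest-scoring blocks removed. The function names and the toy hidden states are hypothetical. The actual pruning script is not part of this card.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def block_influence(hidden_in, hidden_out):
    """1 - mean per-token cosine similarity between a block's input and output."""
    sims = [cosine(x, y) for x, y in zip(hidden_in, hidden_out)]
    return 1.0 - sum(sims) / len(sims)

def blocks_to_prune(states, n_prune):
    """states[i] holds the per-token hidden vectors entering block i;
    states[-1] is the output of the last block."""
    scores = [block_influence(states[i], states[i + 1]) for i in range(len(states) - 1)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(ranked[:n_prune]), scores

# Toy trace: 3 blocks, 2 tokens, hidden size 2 (made-up numbers).
toy_states = [
    [[1.0, 0.0], [0.0, 1.0]],  # input to block 0
    [[1.0, 0.1], [0.1, 1.0]],  # block 0 barely rotates the states
    [[0.0, 1.0], [1.0, 0.0]],  # block 1 changes their direction a lot
    [[0.0, 2.0], [2.0, 0.0]],  # block 2 only rescales them (cosine = 1)
]
pruned, scores = blocks_to_prune(toy_states, n_prune=1)
```

In this toy trace, block 2 only rescales its input, so it receives the lowest score and is the one selected for removal. In practice the scores would be averaged over a calibration set of real hidden states.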

Model size: 0.4B params (Safetensors), tensor type F32