KRULL-Nano Simple

KRULL stands for Knowledge Running Under Lightweight Language.

KRULL-Nano is a lightweight decoder-only Small Language Model (SLM) architecture designed for embedded devices, edge AI, offline inference, and privacy-preserving applications.

Unlike cloud-oriented LLMs optimized for massive datacenters, KRULL-Nano is designed from the ground up for:

  • low latency
  • low memory usage
  • deterministic inference
  • offline execution
  • edge sovereignty
  • efficient deployment on constrained hardware

Architecture

KRULL-Nano uses a compact decoder-only transformer architecture with multiple optimizations for edge execution.

Core Components

Decoder-Only Transformer

  • autoregressive causal language modeling
  • GPT-style token prediction
  • streaming-friendly generation
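
The autoregressive loop behind these points can be sketched in a few lines. This is a minimal illustration, not the code in scripts/generate.py: `generate_greedy` and `toy_scores` are hypothetical names, greedy selection is an assumption, and the toy scoring function stands in for the real model.

```python
from typing import Callable, List


def generate_greedy(
    next_token_scores: Callable[[List[int]], List[float]],
    prompt_ids: List[int],
    max_new_tokens: int,
) -> List[int]:
    """Append one token at a time, conditioning each step on the full prefix."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        scores = next_token_scores(ids)  # one score per vocabulary entry
        # Greedy choice: index of the highest score (an assumption, not
        # necessarily the sampling strategy the repo uses).
        next_id = max(range(len(scores)), key=scores.__getitem__)
        ids.append(next_id)  # a streaming consumer could emit next_id here
    return ids


def toy_scores(ids: List[int]) -> List[float]:
    """Hypothetical stand-in for a model: prefers (last token + 1) mod 5."""
    target = (ids[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


result = generate_greedy(toy_scores, [3], 4)
print(result)
```

Because each new token depends only on tokens already produced, output can be emitted as soon as each step finishes, which is what makes this style of generation streaming-friendly.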

RMSNorm

KRULL replaces LayerNorm with RMSNorm to:

  • reduce computational overhead
  • improve low-precision stability
  • minimize memory bandwidth
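
As a rough sketch of why RMSNorm is cheaper: it skips the mean subtraction and bias of LayerNorm and only rescales by the root mean square. The function name `rms_norm` and the `eps` value below are illustrative assumptions, and the implementation in krull/model.py may differ.

```python
import math
from typing import List


def rms_norm(x: List[float], weight: List[float], eps: float = 1e-6) -> List[float]:
    """Scale x by the reciprocal of its root mean square, then by a learned weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)  # eps guards against division by zero
    return [v * inv_rms * w for v, w in zip(x, weight)]


out = rms_norm([3.0, 4.0], [1.0, 1.0])
# After normalization the mean of squares is approximately 1.
mean_sq_out = sum(v * v for v in out) / len(out)
print(out, mean_sq_out)
```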

Multi-Query Attention (MQA)

Instead of full multi-head attention, KRULL-Nano uses:

  • multiple query heads
  • shared key/value heads

Benefits:

  • reduced KV cache size
  • faster inference
  • lower RAM usage
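
The KV cache saving can be estimated with simple arithmetic. The sketch below assumes a single key/value head shared by all query heads and 2-byte (fp16) cache entries. The layer count, head count, head size, and sequence length are hypothetical and are not taken from configs/krull_nano.json.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_value: int = 2,
) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model: 4 layers, 8 query heads of size 32, 256-token context.
mha = kv_cache_bytes(n_layers=4, n_kv_heads=8, head_dim=32, seq_len=256)
mqa = kv_cache_bytes(n_layers=4, n_kv_heads=1, head_dim=32, seq_len=256)
print(mha, mqa, mha // mqa)
```

Under these assumptions the cache shrinks by a factor equal to the number of query heads, since only the key/value side is shared.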

Gated Feed Forward Network

KRULL uses a gated FFN inspired by modern efficient transformer architectures.

Benefits:

  • improved parameter efficiency
  • lower compute cost
  • better expressivity per parameter
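
The text does not say which gating function KRULL uses. As one possible reading, the sketch below shows a SwiGLU-style block, where a SiLU-activated gate projection multiplies a second "up" projection elementwise before the down projection. All names and the tiny weight matrices are illustrative assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def matvec(w: Matrix, x: List[float]) -> List[float]:
    """Multiply a row-major matrix by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def silu(v: float) -> float:
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def gated_ffn(x: List[float], w_gate: Matrix, w_up: Matrix, w_down: Matrix) -> List[float]:
    """SwiGLU-style FFN (assumed variant): down(silu(gate(x)) * up(x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]  # elementwise gating
    return matvec(w_down, hidden)


# Tiny example: model width 2, hidden width 3.
x = [1.0, -1.0]
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_up = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
w_down = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
y = gated_ffn(x, w_gate, w_up, w_down)
print(y)
```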

What is included

  • Decoder-only GPT-style model
  • RMSNorm
  • Multi-query attention
  • Gated feed-forward block
  • Simple character tokenizer
  • CPU training script
  • Text generation script
  • ONNX export script
  • Windows-friendly imports

Project structure

krull_nano_simple/
├── krull/
│   ├── __init__.py
│   ├── model.py
│   └── tokenizer.py
├── scripts/
│   ├── train_tokenizer.py
│   ├── train_lm.py
│   ├── generate.py
│   └── export_onnx.py
├── configs/
│   └── krull_nano.json
├── data/
│   └── tiny_corpus.txt
├── artifacts/
├── requirements.txt
├── LICENSE
└── README.md

Setup on Windows

Open PowerShell or CMD:

cd C:\workspace\krull_nano_simple
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt

Setup on Linux/macOS

cd krull_nano_simple
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

1. Train tokenizer

python scripts/train_tokenizer.py --input data/tiny_corpus.txt --out artifacts/tokenizer.json
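
For orientation, a simple character tokenizer can be as small as the sketch below. The class `CharTokenizer` and its JSON layout are hypothetical: the actual krull/tokenizer.py and the format of artifacts/tokenizer.json are not shown here and may differ.

```python
import json
from typing import Dict, List


class CharTokenizer:
    """Maps each distinct character to an integer id and back."""

    def __init__(self, vocab: Dict[str, int]):
        self.stoi = dict(vocab)
        self.itos = {i: ch for ch, i in self.stoi.items()}

    @classmethod
    def train(cls, text: str) -> "CharTokenizer":
        # "Training" is just collecting the sorted set of characters.
        chars = sorted(set(text))
        return cls({ch: i for i, ch in enumerate(chars)})

    def encode(self, text: str) -> List[int]:
        # Raises KeyError for characters not seen during training.
        return [self.stoi[ch] for ch in text]

    def decode(self, ids: List[int]) -> str:
        return "".join(self.itos[i] for i in ids)

    def to_json(self) -> str:
        return json.dumps(self.stoi)


tok = CharTokenizer.train("KRULL is small")
ids = tok.encode("KRULL")
restored = CharTokenizer(json.loads(tok.to_json()))
print(ids, restored.decode(ids))
```

A character vocabulary keeps the embedding table tiny, at the cost of longer sequences than a subword tokenizer would produce.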

2. Train model

python scripts/train_lm.py --config configs/krull_nano.json --tokenizer artifacts/tokenizer.json --data data/tiny_corpus.txt --out artifacts/krull_nano.pt --epochs 10 --device cpu

3. Generate text

python scripts/generate.py --model artifacts/krull_nano.pt --tokenizer artifacts/tokenizer.json --prompt "KRULL is" --device cpu

4. Export ONNX

python scripts/export_onnx.py --model artifacts/krull_nano.pt --out artifacts/krull_nano.onnx

Notes

This repo is for learning and experimentation. The default dataset is tiny, so the generated text will be mostly incoherent. Replace data/tiny_corpus.txt with a larger corpus to train a better model.

License

MIT
