LumiVore-1.2B
LumiVore-1.2B is a Mixture-of-Experts (MoE) language model fine-tuned for agentic workflows and conversational AI. It was trained entirely on consumer hardware (an AMD RX 7600 XT with 16 GB of VRAM), demonstrating that capable language models can be developed without datacenter-scale resources.
Model Details
| Attribute | Value |
|---|---|
| Architecture | Mixture-of-Experts (DeepSeek-MoE style) |
| Base Model | Qwen2.5-0.5B-Instruct |
| Total Parameters | 1.36B |
| Active Parameters | ~610M per token (top-2 routing) |
| Experts | 8 (1 shared + 7 routed) |
| MoE Layers | 8 of 24 transformer layers |
| Context Length | 2048 tokens |
| Precision | bfloat16 |
Architecture
LumiVore-1.2B uses a Mixture-of-Experts architecture with:
- 8 experts total: 1 shared expert always active + 7 routed experts
- Top-2 routing: Each token is processed by 2 experts, the always-active shared expert plus 1 routed expert selected by the router
- Sparse activation: Only ~610M parameters are active per token despite 1.36B total
- Load balancing: Auxiliary losses ensure even expert utilization
This design provides the capacity of a larger model with the inference cost of a smaller one.
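The routing described above can be sketched in a few lines of plain Python. This is a toy illustration, not the model's actual implementation: the expert functions, the gating by raw softmax probability, and the Switch-Transformer-style form of the auxiliary loss are all assumptions, since the card does not specify them. Only the expert counts (1 shared, 7 routed, 1 routed expert chosen per token) come from the text.

```python
import math

NUM_ROUTED = 7     # routed experts, per the model card
TOP_K_ROUTED = 1   # routed experts picked per token (the shared expert is always on)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_logits, k=TOP_K_ROUTED):
    """Return [(expert_index, gate_weight)] for the k highest-scoring routed experts."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(i, probs[i]) for i in top]


def moe_forward(x, gate_logits, shared_expert, routed_experts):
    """Shared expert output plus gate-weighted outputs of the selected routed experts."""
    out = shared_expert(x)
    for idx, weight in route(gate_logits):
        expert_out = routed_experts[idx](x)
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out


def load_balance_loss(all_gate_logits, num_experts=NUM_ROUTED):
    """ASSUMED auxiliary loss form: N * sum_i f_i * P_i, where f_i is the fraction
    of tokens whose top-1 routed expert is i and P_i is the mean router probability for i."""
    n_tokens = len(all_gate_logits)
    counts = [0] * num_experts
    prob_sums = [0.0] * num_experts
    for logits in all_gate_logits:
        probs = softmax(logits)
        counts[max(range(num_experts), key=lambda i: probs[i])] += 1
        for i, p in enumerate(probs):
            prob_sums[i] += p
    return num_experts * sum(
        (c / n_tokens) * (s / n_tokens) for c, s in zip(counts, prob_sums)
    )


def make_expert(scale):
    """Hypothetical toy expert: scales every input element by a constant."""
    return lambda x: [scale * v for v in x]


shared = make_expert(1.0)
routed = [make_expert(float(i + 2)) for i in range(NUM_ROUTED)]

token = [1.0, -1.0]
logits = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0]  # router strongly prefers routed expert 2
selected = route(logits)
output = moe_forward(token, logits, shared, routed)

# Each token prefers a different expert (balanced) vs. all prefer expert 0 (collapsed).
balanced_loss = load_balance_loss(
    [[5.0 if i == t else 0.0 for i in range(NUM_ROUTED)] for t in range(NUM_ROUTED)]
)
collapsed_loss = load_balance_loss(
    [[5.0] + [0.0] * (NUM_ROUTED - 1) for _ in range(NUM_ROUTED)]
)
```

Under this assumed loss, evenly spread routing scores lower than routing that collapses onto one expert, which is the behavior a load-balancing term is meant to encourage.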
Training
Stage 1: Capability Building
- Dataset: TerminalTrajectories + OpenThoughts (~11,600 examples)
- Method: Full fine-tuning with LoRA on routing layers
- Duration: ~5.4 hours
- Goal: General agent capabilities, tool use, reasoning
Stage 2: Domain Adaptation
- Dataset: OpenClaw agent-specific data (~11,900 examples)
- Method: LoRA fine-tuning (rank=64, attention + routing)
- Duration: ~5 hours
- Goal: OpenClaw ecosystem specialization
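To give a feel for what rank=64 means in Stage 2, the sketch below counts the trainable parameters a LoRA adapter adds to one weight matrix compared with tuning the full matrix. The hidden size of 896 is an assumption about the Qwen2.5-0.5B base (check the model's `config.json`), and the square projection is a hypothetical example layer, not a statement about which modules were actually adapted.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_param_count(d_in, d_out):
    """Parameters in the full weight matrix the adapter is attached to."""
    return d_in * d_out


HIDDEN = 896  # ASSUMPTION: hidden size of the Qwen2.5-0.5B base model
RANK = 64     # LoRA rank stated in the model card

lora = lora_param_count(HIDDEN, HIDDEN, RANK)
full = full_param_count(HIDDEN, HIDDEN)
ratio = round(lora / full, 3)
print(lora, full, ratio)  # 114688 802816 0.143
```

For a matrix this small, a rank-64 adapter is still about one seventh the size of the full matrix, so the savings are more modest than they would be on a larger model.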
Hardware
- GPU: AMD RX 7600 XT (16GB VRAM)
- Framework: PyTorch with ROCm
- Optimizer: 8-bit AdamW
- Total Training Time: ~10 hours (Stage 1 and Stage 2 combined)
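A rough back-of-the-envelope estimate suggests why an 8-bit optimizer matters on a 16 GB card. The sketch below assumes every parameter is trainable, weights and gradients are stored in bfloat16 (2 bytes each), and AdamW keeps two moment buffers per parameter. It ignores activations, quantization metadata, and framework overhead, so the real footprint will differ.

```python
TOTAL_PARAMS = 1.36e9  # total parameters, per the model card


def training_memory_bytes(n_params, optim_state_bytes, weight_bytes=2, grad_bytes=2):
    """Weights + gradients + two AdamW moment buffers (a simplified estimate)."""
    return n_params * (weight_bytes + grad_bytes + 2 * optim_state_bytes)


mem_8bit = training_memory_bytes(TOTAL_PARAMS, optim_state_bytes=1)   # 8-bit AdamW states
mem_32bit = training_memory_bytes(TOTAL_PARAMS, optim_state_bytes=4)  # 32-bit AdamW states

print(f"8-bit AdamW:  {mem_8bit / 1e9:.2f} GB")
print(f"32-bit AdamW: {mem_32bit / 1e9:.2f} GB")
```

Under these assumptions the 32-bit optimizer states alone would push the total past 16 GB before any activations are counted, while the 8-bit variant leaves roughly half the card free. When only LoRA adapters are trained, the gradient and optimizer terms shrink further.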
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "LumiVore/lumivore-1.2b"

# Load the model in bfloat16 and let device_map="auto" place it on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Plain-text prompt; "\n" separates the system line, the user turn, and the assistant cue.
prompt = "You are a helpful AI assistant.\nUser: Hello!\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 100 new tokens and print the decoded text.
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
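For multi-turn conversations, the plain-text prompt used in the snippet above can be assembled with a small helper. This is only a sketch: `build_prompt` is a hypothetical function, and it assumes the "User:" / "Assistant:" line format from the usage example is what the model expects. If the tokenizer ships a chat template, that is likely the more reliable route.

```python
def build_prompt(system, turns):
    """Join a system line and (role, text) turns into a plain-text prompt.

    `turns` is a list of (role, text) pairs with role "user" or "assistant".
    The prompt ends with "Assistant:" so the model continues as the assistant.
    """
    lines = [system]
    for role, text in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unknown role: {role!r}")
        lines.append(f"{role.capitalize()}: {text}")
    lines.append("Assistant:")
    return "\n".join(lines)


prompt = build_prompt(
    "You are a helpful AI assistant. Keep answers short.",
    [("user", "Hello!")],
)
print(prompt)
```

The system line is also the natural place to ask for concise answers, which the limitations below suggest for reducing verbosity.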
Limitations
- Small base: Built on Qwen2.5-0.5B — foundational limitations apply
- Training scale: ~23.5K examples in total (Stage 1 plus Stage 2), versus millions for production models
- Identity: May occasionally claim to be other models (GPT-4, Qwen, etc.)
- Verbosity: Can be verbose; use system prompts to guide conciseness
- No RLHF: No reinforcement learning from human feedback
Evaluation
This model prioritizes:
- ✅ Agentic tool use — calling functions, following patterns
- ✅ Structured outputs — JSON, markdown, code
- ✅ Conversational flow — turn-taking, context tracking
- ⚠️ Creative writing — not a primary training objective
- ❌ Factual knowledge — limited by base model size
Resources
| Resource | Link |
|---|---|
| GitHub | https://github.com/dansan-claw/lumivore |
| Website | https://lumivore.ai |
| Discord | https://discord.gg/M7U8JCUukD |
| Datasets | See LumiVore organization |
Citation
@misc{lumivore-1.2b,
title={LumiVore-1.2B: A Mixture-of-Experts Model for Agentic AI},
author={van Eek, Daniel},
year={2026},
url={https://huggingface.co/LumiVore/lumivore-1.2b}
}
License
Apache 2.0 — use it, modify it, ship it in your products.
LumiVore AI explores the future of intelligent systems — building AI that is efficient, adaptable, and accessible.