A newer version of this model is available: QuantaSparkLabs/NYXIS-Pro

NYXIS-1.1B — Identity-Aligned Lightweight Language Model by QuantaSparkLabs

All New NYXIS 2B!

This repository contains the fully merged model weights (not just LoRA adapters),
compatible with 🤗 Transformers, vLLM, Text Generation Inference, Unsloth, and custom pipelines.

Note: the Featherless AI inference provider has not yet updated its servers and model weights, so responses served through it may be broken or unstable.


📋 Overview

NYXIS-1.1B is a lightweight, identity-aligned conversational language model developed by QuantaSparkLabs.
It is fine-tuned from Qwen2.5-1.5B-Instruct using QLoRA + Unsloth on a custom curated dataset — built entirely on a T4 GPU.

NYXIS is designed for stable persona consistency, instruction following, web-search tool calling, and efficient edge deployment — all while keeping a tiny VRAM footprint.


🎯 Design Goals

| 🎯 Goal | 📌 Detail |
| --- | --- |
| 🪪 Identity Alignment | Consistent "I'm NYXIS, created by QuantaSparkLabs" across all contexts |
| 🌐 Tool Calling | Trained web-search function-call pattern built in |
| ⚡ Efficiency | Runs on T4 / 8GB VRAM without quantization tricks |
| 🔧 Plug & Play | Fully merged weights, no adapter loading needed |
| 🧠 Knowledge Retention | Custom dataset preserves Qwen2.5 base knowledge |

✨ Core Capabilities

| Capability | Description |
| --- | --- |
| 🧠 Conversational AI | Chat-optimized with Qwen2.5 `<\|im_start\|>` / `<\|im_end\|>` template |
| 🪪 Identity Alignment | Consistent "NYXIS by QuantaSparkLabs" persona under all prompts |
| 📚 Instruction Following | Supports reasoning, explanation, summarization, and coding |
| 🌐 Web Search Tool | Emits web_search(query) function calls when external info is needed |
| Lightweight | Runs on 6–8 GB VRAM in FP16 |
| 🔧 Fully Merged Weights | Standalone model, no LoRA adapter required at runtime |

🏗️ Model Architecture

🔩 Base Model

| Field | Value |
| --- | --- |
| Backbone | Qwen/Qwen2.5-1.5B-Instruct |
| Framework | Hugging Face Transformers + Unsloth |
| Fine-tuning | QLoRA (rank 16) → Full Weight Merge |
| Chat Template | Qwen2.5 ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |

🔄 Training Pipeline

Qwen2.5-1.5B-Instruct (Base)
        ↓
  QLoRA Fine-Tuning
  (rank 16, Unsloth)
        ↓
  Custom 500-example
  Identity + Chat + Tool Dataset
        ↓
  Full Weight Merge
  (adapter baked into model)
        ↓
  NYXIS-1.1B — Deployed on HuggingFace 🚀

📊 Technical Specifications

| ⚙️ Parameter | 📌 Value |
| --- | --- |
| Model Name | NYXIS-1.1B |
| Organization | QuantaSparkLabs |
| Base Model | Qwen/Qwen2.5-1.5B-Instruct |
| Total Parameters | ~1.56 Billion |
| Trainable Parameters | 18.5M (1.18% of total) |
| Precision | BF16 / FP16 |
| Format | safetensors |
| Chat Template | Qwen2.5 ChatML (Jinja) |
| Inference Mode | Causal LM |
| File Size | ~2.0–2.2 GB |
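
The trainable-parameter figure above can be reproduced with a short calculation. The sketch below assumes the published Qwen2.5-1.5B layer shapes (hidden size 1536, MLP size 8960, 28 blocks, 2 key/value heads of dimension 128) and assumes that rank-16 LoRA adapters were attached to all seven linear projections in every block. This card does not list the target modules, so that choice is a guess rather than the documented training setup.

```python
# LoRA adds rank * (d_in + d_out) parameters per adapted linear layer.
RANK = 16              # LoRA rank stated in this card
HIDDEN = 1536          # assumed Qwen2.5-1.5B hidden size
MLP = 8960             # assumed intermediate (MLP) size
KV_DIM = 2 * 128       # assumed 2 key/value heads x head dimension 128
LAYERS = 28            # assumed number of transformer blocks

# (d_in, d_out) for each projection assumed to carry an adapter.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(rank, shapes, layers):
    """Total LoRA parameters across all blocks for the given projections."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * layers


trainable = lora_params(RANK, TARGETS, LAYERS)
total = 1.56e9  # "~1.56 Billion" total parameters from the table above
percent = 100 * trainable / total
print(f"{trainable:,} trainable parameters ({percent:.2f}% of total)")
```

Under these assumptions the count comes to roughly 18.5M parameters, about 1.18% of the total, which is in line with the values reported in the table.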

🧬 Training Details

⚡ Fine-Tuning Method

| 🔬 Setting | 📌 Value |
| --- | --- |
| Technique | QLoRA (Quantized Low-Rank Adaptation) |
| Library | Unsloth |
| LoRA Rank | 16 |
| Optimizer | AdamW (paged) |
| Learning Rate | 2e-4 |
| Epochs | 3 |
| Total Steps | 189 |
| Batch Size | 8 (2 per device × 4 grad accumulation) |
| Hardware | T4 GPU |
| Final Training Loss | ~0.08 ✅ |
| Merge Strategy | Full weight merge, adapter baked in |
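
The step count follows from the dataset size, the effective batch size, and the number of epochs. The following is a minimal sketch of that arithmetic, assuming a single GPU and that the final partial batch of each epoch still counts as one optimizer step; the helper name is illustrative and not part of any training library.

```python
import math


def total_optimizer_steps(num_examples, per_device_batch, grad_accum, epochs, num_devices=1):
    """Return (effective batch size, total optimizer steps) for a training run."""
    effective_batch = per_device_batch * grad_accum * num_devices
    # A trailing partial batch is assumed to still trigger an optimizer step.
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


# Values from this card: 500 examples, 2 per device, 4 accumulation steps, 3 epochs.
effective_batch, steps = total_optimizer_steps(500, 2, 4, 3)
print(effective_batch, steps)
```

With the values from this card, that gives an effective batch of 8 and 63 steps per epoch, for 189 steps in total.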

📂 Dataset Composition (500 examples)

| 🗂️ Category | 📊 Proportion | 📝 Description |
| --- | --- | --- |
| 🪪 Identity | 10% (50 examples) | Prompts in which the model states its identity |
| 💬 Open Chat | 70% (350 examples) | Diverse assistant responses: science, jokes, coding, daily life, etc. |
| 🌐 Web Search Tool | 20% (100 examples) | Function-calling pattern: model requests web_search(query) when it needs external info |

The dataset was custom-built to preserve Qwen2.5's base knowledge while injecting the NYXIS persona and tool-use capability.


💻 Quick Start

🔧 Installation

# Option A: Standard Transformers
pip install transformers accelerate torch

# Option B: Unsloth (recommended for speed + memory efficiency)
pip install unsloth

🚀 Load & Chat — Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

MODEL_ID = "QuantaSparkLabs/NYXIS-1.1B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)
model.eval()

messages = [
    {"role": "system", "content": "You are NYXIS, a helpful AI created by QuantaSparkLabs."},
    {"role": "user", "content": "Hello NYXIS! Who are you?"}
]

prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=True,  # sample explicitly so temperature / top_p are applied
        temperature=0.6,
        top_p=0.9,
        repetition_penalty=1.15,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True
)
print("NYXIS:", response)

⚡ Load with Unsloth (Recommended)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="QuantaSparkLabs/NYXIS-1.1B",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

🖊️ Manual Qwen2.5 Chat Prompt Format

NYXIS uses the standard Qwen2.5 ChatML tokens. Build your prompt manually like this:

messages = [
    {"role": "system", "content": "You are NYXIS, a helpful AI created by QuantaSparkLabs."},
    {"role": "user", "content": "What is a black hole?"}
]

prompt = ""
for msg in messages:
    prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
prompt += "<|im_start|>assistant\n"

Then tokenize and generate normally.


🌐 Web Search Tool Pattern

When a system prompt mentions that a web_search tool is available, NYXIS may emit a function call instead of answering directly:

<|im_start|>assistant
[{"type": "function", "function": {"name": "web_search", "arguments": {"query": "latest news on AI"}}}]
<|im_end|>

You can intercept this, run an actual search, and feed the result back as a tool message to get the final answer.

⚠️ The web-search pattern is trained behaviour only — it does not include a live search engine.
You need to implement the tool runner yourself (e.g. using SerpAPI, DuckDuckGo, or Tavily).
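
One way to wire up the tool runner is sketched below. It parses the JSON call format shown above, dispatches to a search function, and appends the result as a `tool` message. The helper names and the `fake_web_search` stub are hypothetical placeholders, and it is an assumption that this repository's chat template renders the `tool` role the way the stock Qwen2.5 template does.

```python
import json


def parse_tool_calls(assistant_text):
    """Return a list of (name, arguments) pairs, or [] for a normal text answer."""
    text = assistant_text.replace("<|im_end|>", "").strip()
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, list):
        return []
    calls = []
    for item in payload:
        if not isinstance(item, dict) or item.get("type") != "function":
            continue
        fn = item.get("function", {})
        if not isinstance(fn, dict) or "name" not in fn:
            continue
        args = fn.get("arguments", {})
        if isinstance(args, dict):
            calls.append((fn["name"], args))
    return calls


def fake_web_search(query):
    """Hypothetical stub; replace with SerpAPI, DuckDuckGo, Tavily, etc."""
    return f"(no live search configured) results for: {query}"


TOOLS = {"web_search": fake_web_search}


def run_tool_calls(messages, assistant_text, tools=TOOLS):
    """Append the assistant turn plus one tool message per call.

    Returns (messages, ran_any_tool). The input list is not modified.
    """
    calls = parse_tool_calls(assistant_text)
    if not calls:
        return messages, False
    clean = assistant_text.replace("<|im_end|>", "").strip()
    updated = list(messages) + [{"role": "assistant", "content": clean}]
    for name, args in calls:
        handler = tools.get(name)
        result = handler(**args) if handler else f"unknown tool: {name}"
        updated.append({"role": "tool", "content": result})
    return updated, True


raw = (
    '[{"type": "function", "function": {"name": "web_search", '
    '"arguments": {"query": "latest news on AI"}}}]\n<|im_end|>'
)
history = [{"role": "user", "content": "What's new in AI?"}]
history, used_tool = run_tool_calls(history, raw)
```

When `used_tool` is true, re-apply the chat template to `history` and call `generate` again so the model can write its final answer from the tool result. The sketch does not validate argument names, so a call with unexpected keys would raise a `TypeError` in the handler.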


⚡ Hardware Requirements

| 🖥️ Hardware | 🚦 Performance |
| --- | --- |
| T4 GPU (16GB) | Optimal (trained on this) |
| RTX 3060 (12GB) | Smooth FP16 |
| 8GB VRAM GPU | ⚠️ Usable, FP16 recommended |
| 4GB VRAM GPU | 🔶 Use 4-bit via Unsloth / BitsAndBytes |
| CPU Only | 🐌 Slow but functional |
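
As a rough guide to these tiers, the memory taken by the weights alone can be estimated from the parameter count and the bits stored per parameter. This is a back-of-the-envelope sketch only: it uses the ~1.56B figure from the Technical Specifications and ignores activations, the KV cache, the CUDA context, and quantization overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 1.56e9  # "~1.56 Billion" from Technical Specifications

fp16_gb = weight_memory_gb(PARAMS, 16)  # FP16 / BF16 weights
int4_gb = weight_memory_gb(PARAMS, 4)   # 4-bit quantized weights
print(f"FP16 weights: ~{fp16_gb:.2f} GB, 4-bit weights: ~{int4_gb:.2f} GB")
```

The gap between the two estimates is why 4-bit loading is suggested for 4GB cards, while FP16 leaves working headroom on GPUs with 8GB or more.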

📁 Repository Structure

NYXIS-1.1B/
├── model.safetensors        # Full merged weights (~2.2 GB)
├── config.json              # Model architecture config
├── tokenizer.json           # Qwen2.5 tokenizer
├── tokenizer_config.json    # Chat template config
├── generation_config.json   # Default generation settings
├── chat_template.jinja      # Jinja chat template
└── README.md

⚠️ Known Limitations

| ⚠️ Issue | 📝 Notes |
| --- | --- |
| 🔁 Hallucination | May occasionally hallucinate or oversimplify (1.5B scale) |
| 🗣️ Identity Bias | May append "How can I help you today?"; reduce via system prompt tuning |
| 🔢 Math Reasoning | Limited complex math ability (small model) |
| 🌍 Language | Primarily English-focused |
| 🚫 Critical Use | Not suitable for medical, legal, or safety-critical applications |
| 🔍 Web Search | Tool pattern only, no live search engine included |

🔒 Safety & Alignment

NYXIS is trained with:

  • ✅ Identity alignment dataset (consistent persona)
  • ✅ Instruction-balanced samples (diverse and safe)
  • ✅ Controlled decoding configuration (anti-loop)

Recommended generation settings:

temperature = 0.6
top_p = 0.9
repetition_penalty = 1.1  # recommended range: 1.1 to 1.2
no_repeat_ngram_size = 3

🚀 Version History

| 🏷️ Version | 📅 Date | 📝 Notes |
| --- | --- | --- |
| v1.0 | Early 2025 | Initial LoRA fine-tune on TinyLlama |
| v1.1 (NYXIS 2.1) | 2025 | Rebuilt on Qwen2.5-1.5B-Instruct · QLoRA · Unsloth · 500 examples · Web-search tool · Full merge · HF deployment |

📜 License

This model is licensed under the Apache 2.0 License,
following the original Qwen2.5-1.5B-Instruct license terms.


NYXIS • Built by QuantaSparkLabs • 2025–2026
Lightweight • Identity-Aligned • Efficient • Open Source

If you find NYXIS useful, give the repo a ❤️ and share your creations!
