Instructions for using HALION-AI/helionx-core-v1.5 with libraries and local apps.

Libraries

How to use HALION-AI/helionx-core-v1.5 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="HALION-AI/helionx-core-v1.5")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the output is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("HALION-AI/helionx-core-v1.5")
model = AutoModelForCausalLM.from_pretrained("HALION-AI/helionx-core-v1.5")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use HALION-AI/helionx-core-v1.5 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "HALION-AI/helionx-core-v1.5"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "HALION-AI/helionx-core-v1.5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
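
The same request can also be built from Python. This is a minimal sketch using only the standard library. It assumes the vLLM server started above is listening on localhost:8000, and `build_chat_request` is a hypothetical helper name, not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_content: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000",  # assumed host and port, matching the curl call above
    "HALION-AI/helionx-core-v1.5",
    "What is the capital of France?",
)

# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     reply = json.loads(response.read())
#     print(reply["choices"][0]["message"]["content"])
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made.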

Use Docker

docker model run hf.co/HALION-AI/helionx-core-v1.5

SGLang

How to use HALION-AI/helionx-core-v1.5 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "HALION-AI/helionx-core-v1.5" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "HALION-AI/helionx-core-v1.5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "HALION-AI/helionx-core-v1.5" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "HALION-AI/helionx-core-v1.5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use HALION-AI/helionx-core-v1.5 with Docker Model Runner:
```
docker model run hf.co/HALION-AI/helionx-core-v1.5
```

HelionX-Core v1.1

Proprietary Hosted-Only Cognitive & Defensive Intelligence System

⚠️ IMPORTANT — READ BEFORE USE

HelionX-Core is NOT a general-purpose chatbot.
This model is private, proprietary, and hosted-only.

❌ Weights are not licensed for public redistribution
❌ Local inference by third parties is not permitted
✅ Usage is restricted to HALION AI–controlled hosted environments
✅ Model access is enforced via Modal-hosted inference

This repository exists as a private artifact store for deployment infrastructure.

Overview

HelionX-Core v1.1 is a text-only large language model designed for:

Governed reasoning
Defensive intelligence (non-operational)
Explicit capability boundaries
Auditability-first system design

The model was fine-tuned with LoRA on top of Qwen/Qwen2.5-7B. The adapter was then fully merged into the base weights, and the result was exported as safetensors.
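
Merging a LoRA adapter folds the low-rank update into each adapted weight matrix, W' = W + (alpha / r) * B A, so the merged checkpoint has the same shapes as the base model and needs no adapter at inference time. The sketch below shows this on a toy matrix in plain Python. The dimensions, rank, and alpha are illustrative assumptions, not HelionX-Core's actual training configuration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
d_out, d_in, r, alpha = 4, 3, 2, 4  # toy sizes, chosen only for illustration

W = rand_matrix(d_out, d_in)  # frozen base weight
A = rand_matrix(r, d_in)      # LoRA down-projection
B = rand_matrix(d_out, r)     # LoRA up-projection
scaling = alpha / r
x = [random.gauss(0, 1) for _ in range(d_in)]

# Unmerged forward pass: base path plus scaled low-rank path.
base_out = matvec(W, x)
low_rank_out = matvec(B, matvec(A, x))
y_adapter = [b + scaling * l for b, l in zip(base_out, low_rank_out)]

# Merged forward pass: the update is folded into a single matrix.
BA = matmul(B, A)
W_merged = [
    [w + scaling * d for w, d in zip(w_row, d_row)]
    for w_row, d_row in zip(W, BA)
]
y_merged = matvec(W_merged, x)

outputs_match = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for a, b in zip(y_adapter, y_merged)
)
print(outputs_match)
```

Both paths give the same output up to floating-point rounding, which is why a merged model can be served without the adapter.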

There is no autonomy, no tool execution, and no self-learning.

Core Identity (Frozen)

Project Name: HelionX-Core
Organization: HALION AI
Current Model Version: v1.1
Architecture: Qwen2ForCausalLM
Base Model: Qwen/Qwen2.5-7B
Fine-tuning Method: LoRA → merged into base weights
Modality: Text-only

HelionX-Core is a proprietary hosted intelligence system, not an assistant product.

Licensing & Access Model (Final)

License: Proprietary (see LICENSE file)
Hugging Face metadata: License marked as "other" due to Hugging Face UI constraints
Enforcement mechanisms:
- Private Hugging Face repository
- Modal-hosted inference only
- Custom proprietary license
- No public API keys
- No public downloads

Comparable access model:

OpenAI (GPT)
Anthropic (Claude)
Cohere (hosted variants)
ElevenLabs

Platforms Used (Frozen)

Hugging Face (Private)

Purpose: Model artifact storage ONLY
Repo: HALION-AI/helionx-core-v1.1
Contains:
- Model weights
- Tokenizer
- Config files
- Metadata
❌ Not used for public inference

Modal.com

Purpose: Runtime execution & GPU inference
GPUs: A10G / A100 (40GB)
Handles:
- Model loading
- Inference execution
- Warm-start optimization
- Future API exposure
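
Warm-start loading generally means paying the model-load cost once per process and reusing the loaded object for later requests. A minimal, framework-agnostic sketch follows. `load_model` and `handle_request` are hypothetical names, and the load is simulated with a counter rather than real weight loading, so this is not the actual Modal deployment code.

```python
from functools import lru_cache

load_count = 0  # counts how many times the expensive load actually runs


@lru_cache(maxsize=1)
def load_model(model_id: str) -> dict:
    """Stand-in for an expensive model load; runs once per process per model_id."""
    global load_count
    load_count += 1
    return {"model_id": model_id, "ready": True}


def handle_request(prompt: str) -> str:
    # The first call triggers the load; later calls reuse the cached object.
    model = load_model("HALION-AI/helionx-core-v1.1")
    return f"[{model['model_id']}] received {len(prompt)} characters"


handle_request("first request")   # cold: performs the load
handle_request("second request")  # warm: served from the cache
print(load_count)
```

In a hosted setting the same idea is usually tied to container lifetime, so only the first request after a container starts pays the load cost.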

Local Machine

Purpose:
- Repo management
- Writing Modal scripts
- Uploading to Hugging Face
❌ No inference or training

Repository Contents

This Hugging Face repository contains a complete, Transformers-compatible model.

Model Weights

model-00001-of-00004.safetensors
model-00002-of-00004.safetensors
model-00003-of-00004.safetensors
model-00004-of-00004.safetensors
model.safetensors.index.json
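
For sharded checkpoints, model.safetensors.index.json maps each tensor name to the shard file that stores it, under a `weight_map` key. The sketch below parses a tiny hand-written index to show the lookup. The tensor names and `total_size` value are illustrative assumptions, not values read from this repository.

```python
import json
from collections import defaultdict

# Hand-written stand-in for model.safetensors.index.json (illustrative values only).
index_text = """
{
  "metadata": {"total_size": 1024},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "model.layers.27.mlp.down_proj.weight": "model-00004-of-00004.safetensors",
    "lm_head.weight": "model-00004-of-00004.safetensors"
  }
}
"""

index = json.loads(index_text)

# Group tensor names by the shard file that holds them.
tensors_by_shard = defaultdict(list)
for tensor_name, shard_file in index["weight_map"].items():
    tensors_by_shard[shard_file].append(tensor_name)

for shard_file in sorted(tensors_by_shard):
    print(shard_file, len(tensors_by_shard[shard_file]))
```

A loader uses this mapping to open only the shard that contains a requested tensor instead of reading all four files.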

Configuration

config.json (Qwen2-compatible, fixed)
generation_config.json

Tokenizer

Important: The tokenizer is inherited directly from Qwen2.5-7B.
LoRA fine-tuning does not modify the tokenizer files.

tokenizer.json
tokenizer_config.json
vocab.json
merges.txt

Metadata

README.md
LICENSE (custom proprietary)
.gitattributes
.gitignore

v1.5 — Stabilization Layer (System-Level, Not Weights)

The following features belong to the v1.5 stabilization layer. They are implemented at the inference and product layer, not inside the model weights:

Warm-start model loading
Reduced cold-start latency
Anchored system prompt
Authentication
Request limits
Basic prompt abuse prevention
Logging hooks
HTTPS API exposure via Modal
Session-based conversation memory
Basic admin statistics
Usage analytics
Web chat interface (separate layer)

These features do NOT require re-training or re-uploading weights.
They are enforced by runtime, orchestration, and API design.
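
Because these controls live outside the weights, they can be pictured as a thin wrapper around the model call. The sketch below illustrates four items from the list: authentication, request limits, an anchored system prompt, and session-based memory. Every name in it, including `Gateway`, `SYSTEM_PROMPT`, and the `generate` callable, is hypothetical and not taken from the HALION AI deployment.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set

Message = Dict[str, str]

# Placeholder text; the real anchored system prompt is not published.
SYSTEM_PROMPT = "You are HelionX-Core. State your limitations when uncertain."


class Gateway:
    """Hypothetical request wrapper enforcing system-level controls."""

    def __init__(self, generate: Callable[[List[Message]], str],
                 api_keys: Set[str], max_requests: int) -> None:
        self.generate = generate
        self.api_keys = api_keys
        self.max_requests = max_requests
        self.request_counts: Dict[str, int] = defaultdict(int)
        self.sessions: Dict[str, List[Message]] = defaultdict(list)

    def chat(self, api_key: str, session_id: str, user_text: str) -> str:
        if api_key not in self.api_keys:                        # authentication
            raise PermissionError("unknown API key")
        if self.request_counts[api_key] >= self.max_requests:   # request limit
            raise RuntimeError("request limit reached")
        self.request_counts[api_key] += 1

        history = self.sessions[session_id]                     # session memory
        history.append({"role": "user", "content": user_text})
        # The system prompt is always prepended and cannot be replaced by callers.
        messages = [{"role": "system", "content": SYSTEM_PROMPT}] + history
        reply = self.generate(messages)
        history.append({"role": "assistant", "content": reply})
        return reply


def fake_generate(messages: List[Message]) -> str:
    """Stand-in for the hosted model call."""
    return f"seen {len(messages)} messages"


gateway = Gateway(generate=fake_generate, api_keys={"demo-key"}, max_requests=2)
first = gateway.chat("demo-key", "s1", "Hello")
second = gateway.chat("demo-key", "s1", "Again")

try:
    gateway.chat("demo-key", "s1", "One more")
    limit_hit = False
except RuntimeError:
    limit_hit = True
```

Swapping `fake_generate` for a real inference call changes nothing in the wrapper, which is the sense in which these features need no retraining or re-upload of weights.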

Intended Use

HelionX-Core is intended for:

Research
Defensive cybersecurity reasoning (non-operational)
Architecture study
Controlled, audited deployments

It is not intended for:

Autonomous action
Tool execution
Offensive security
Multimodal tasks
Consumer chatbot usage

Explicit Limitations

Text-only
No internet access
No tools
No self-learning
No autonomy
No multimodal input
No emotional role-play
No hidden capabilities

The system explicitly signals uncertainty and limitations when applicable.

Roadmap Status

v1.1: ✅ Complete (current model)
v1.5: Stabilization in progress (system layer)
v2.0: Audio pipeline (separate module)
v2.5+: Multimodal (future, separate training)

The roadmap is frozen and non-negotiable.

Legal Notice

Use of this model is governed exclusively by the terms in the LICENSE file.
Unauthorized redistribution, modification, or public deployment is prohibited.

Safetensors

Model size: 8B params
Tensor type: BF16