HelionX-Core v1.1
Proprietary Hosted-Only Cognitive & Defensive Intelligence System
⚠️ IMPORTANT — READ BEFORE USE
HelionX-Core is NOT a general-purpose chatbot.
This model is private, proprietary, and hosted-only.
- ❌ Weights are not licensed for public redistribution
- ❌ Local inference by third parties is not permitted
- ✅ Usage is restricted to HALION AI–controlled hosted environments
- ✅ Model access is enforced via Modal-hosted inference
This repository exists as a private artifact store for deployment infrastructure.
Overview
HelionX-Core v1.1 is a text-only Large Language Model designed for:
- Governed reasoning
- Defensive intelligence (non-operational)
- Explicit capability boundaries
- Auditability-first system design
The model is fine-tuned via LoRA on top of Qwen/Qwen2.5-7B, then fully merged into base weights and exported as safetensors.
There is no autonomy, no tool execution, and no self-learning.
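As a rough illustration of what "LoRA merged into base weights" means, the sketch below applies the standard LoRA update W' = W + (alpha / r) · B · A to a toy matrix. The shapes, values, and the helper names `matmul` and `merge_lora` are assumptions made for this example; they are not taken from the HelionX-Core training code.

```python
# Illustrative LoRA merge on a tiny matrix (pure Python, no ML libraries).
# All shapes and values are made up for the example.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # (out x r) @ (r x in) -> (out x in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Base weight is 2 x 3. Adapter rank r = 1, so B is 2 x 1 and A is 1 x 3.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]
B = [[0.5], [-0.5]]

merged = merge_lora(W, A, B, alpha=2, rank=1)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time: the exported checkpoint has the same tensor shapes as the base model.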
Core Identity (Frozen)
- Project Name: HelionX-Core
- Organization: HALION AI
- Current Model Version: v1.1
- Architecture: Qwen2ForCausalLM
- Base Model: Qwen/Qwen2.5-7B
- Fine-tuning Method: LoRA → merged into base weights
- Modality: Text-only
HelionX-Core is a proprietary hosted intelligence system, not an assistant product.
Licensing & Access Model (Final)
- License: Proprietary (see LICENSE file)
- Hugging Face metadata: Marked as "other" due to HF UI constraints
- Enforcement mechanisms:
- Private Hugging Face repository
- Modal-hosted inference only
- Custom proprietary license
- No public API keys
- No public downloads
Comparable access model:
- OpenAI (GPT)
- Anthropic (Claude)
- Cohere (hosted variants)
- ElevenLabs
Platforms Used (Frozen)
Hugging Face (Private)
- Purpose: Model artifact storage ONLY
- Repo: HALION-AI/helionx-core-v1.1
- Contains:
- Model weights
- Tokenizer
- Config files
- Metadata
- ❌ Not used for public inference
Modal.com
- Purpose: Runtime execution & GPU inference
- GPUs: A10G / A100 (40GB)
- Handles:
- Model loading
- Inference execution
- Warm-start optimization
- Future API exposure
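Warm-start loading generally means paying the model-load cost once per container and reusing the loaded object for later requests. The sketch below shows that pattern in plain Python. `WarmModelHolder` and `fake_loader` are hypothetical names, and the loader is a stand-in; the actual Modal deployment code is not shown in this repository description.

```python
import time


class WarmModelHolder:
    """Loads a model once and reuses it for every later request."""

    def __init__(self, loader):
        self._loader = loader  # callable that performs the expensive load
        self._model = None
        self.load_count = 0

    def get(self):
        if self._model is None:  # cold path: first request in this process
            self._model = self._loader()
            self.load_count += 1
        return self._model  # warm path: reuse the already-loaded object


def fake_loader():
    """Stand-in for reading safetensors shards onto a GPU (assumption)."""
    time.sleep(0.01)
    return {"name": "helionx-core", "ready": True}


holder = WarmModelHolder(fake_loader)
first = holder.get()   # triggers the load
second = holder.get()  # served from the cached object
print(holder.load_count)
```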
Local Machine
- Purpose:
- Repo management
- Writing Modal scripts
- Uploading to Hugging Face
- ❌ No inference or training
Repository Contents
This Hugging Face repository contains a fully valid Transformers model.
Model Weights
- model-00001-of-00004.safetensors
- model-00002-of-00004.safetensors
- model-00003-of-00004.safetensors
- model-00004-of-00004.safetensors
- model.safetensors.index.json
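In a sharded Transformers checkpoint, `model.safetensors.index.json` carries a `weight_map` that maps each tensor name to the shard file holding it. The sketch below groups tensors by shard. The three-entry index is hand-written for illustration and does not reflect the actual tensor layout of this repository; `tensors_per_shard` is a hypothetical helper.

```python
import json
from collections import defaultdict


def tensors_per_shard(index):
    """Group tensor names by the shard file that stores them."""
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return dict(shards)


# Tiny hand-written index for illustration; a real file lists every tensor.
example_index = json.loads("""
{
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00004.safetensors",
    "lm_head.weight": "model-00004-of-00004.safetensors"
  }
}
""")

grouped = tensors_per_shard(example_index)
for shard_file in sorted(grouped):
    print(shard_file, len(grouped[shard_file]))
```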
Configuration
- config.json (Qwen2-compatible, fixed)
- generation_config.json
Tokenizer
Important: Tokenizer is inherited directly from Qwen2.5-7B.
LoRA fine-tuning does not modify tokenizer files.
- tokenizer.json
- tokenizer_config.json
- vocab.json
- merges.txt
Metadata
- README.md
- LICENSE (custom proprietary)
- .gitattributes
- .gitignore
v1.5 — Stabilization Layer (System-Level, Not Weights)
The following features belong to v1.5 stabilization, implemented at the inference / product layer, not inside the model weights:
- Warm-start model loading
- Reduced cold-start latency
- Anchored system prompt
- Authentication
- Request limits
- Basic prompt abuse prevention
- Logging hooks
- HTTPS API exposure via Modal
- Session-based conversation memory
- Basic admin statistics
- Usage analytics
- Web chat interface (separate layer)
These features do NOT require re-training or re-uploading weights.
They are enforced by runtime, orchestration, and API design.
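To make the "runtime, not weights" point concrete, here is a minimal sketch of three of the listed features living entirely outside the model: an anchored system prompt, session-based conversation memory, and a sliding-window request limit. `SessionGateway`, `SYSTEM_PROMPT`, and the limit values are assumptions for illustration; the actual HALION AI service code is not part of this document.

```python
import time
from collections import defaultdict, deque

# Placeholder prompt text (assumption); the real anchored prompt is not published.
SYSTEM_PROMPT = "You are HelionX-Core. State your limits when unsure."


class SessionGateway:
    """Builds model input per session and enforces a per-session request limit."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)    # session id -> request timestamps
        self._history = defaultdict(list)  # session id -> prior turns

    def allow(self, session_id):
        """Sliding-window check: drop expired timestamps, then count the rest."""
        now = self.clock()
        hits = self._hits[session_id]
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

    def build_messages(self, session_id, user_text):
        """Return the anchored system prompt followed by the session history."""
        if not self.allow(session_id):
            raise RuntimeError("request limit exceeded")
        self._history[session_id].append({"role": "user", "content": user_text})
        anchored = [{"role": "system", "content": SYSTEM_PROMPT}]
        return anchored + list(self._history[session_id])

    def record_reply(self, session_id, reply_text):
        """Store the model's reply so the next turn sees it."""
        self._history[session_id].append({"role": "assistant", "content": reply_text})


gateway = SessionGateway(max_requests=2, window_seconds=60)
msgs = gateway.build_messages("s1", "What are your limits?")
gateway.record_reply("s1", "I am text-only.")
msgs2 = gateway.build_messages("s1", "Anything else?")
print(len(msgs), len(msgs2))
```

Because the system prompt is prepended on every call rather than stored in the history, a user turn cannot displace it, and changing the prompt or the limits requires only a redeploy of this layer.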
Intended Use
HelionX-Core is intended for:
- Research
- Defensive cybersecurity reasoning (non-operational)
- Architecture study
- Controlled, audited deployments
It is not intended for:
- Autonomous action
- Tool execution
- Offensive security
- Multimodal tasks
- Consumer chatbot usage
Explicit Limitations
- Text-only
- No internet access
- No tools
- No self-learning
- No autonomy
- No multimodal input
- No emotional role-play
- No hidden capabilities
The system explicitly signals uncertainty and limitations when applicable.
Roadmap Status
- v1.1: ✅ Complete (current model)
- v1.5: Stabilization in progress (system layer)
- v2.0: Audio pipeline (separate module)
- v2.5+: Multimodal (future, separate training)
The roadmap is frozen and non-negotiable.
Legal Notice
© 2026 HALION AI
All rights reserved.
Use of this model is governed exclusively by the terms in the LICENSE file.
Unauthorized redistribution, modification, or public deployment is prohibited.