Q-Office-Suite Runtime: local HTTP server hosting all 9 sovereign specialists
Qovaryx: sovereign options-decoder AI
This model is part of Qovaryx, an 11-head sovereign AI cluster that grades options trades in under a millisecond on CPU.
- Download the desktop beta: https://qovaryx.jehorizon.com/download.html
- Read the research: https://qovaryx.jehorizon.com/research
- Main site: https://qovaryx.jehorizon.com
Not financial advice. Options trading carries substantial risk.
📦 Shipped inside the Qovaryx app
This is a component of the Qovaryx Options Decoder cluster. It is published here for transparency and research reproducibility; the runtime is bundled in the desktop app, not installed from Hugging Face. Installer links have been removed from this card.
- Download the signed beta: https://qovaryx.jehorizon.com/download.html
- Read the research: https://qovaryx.jehorizon.com/research
This repo ships the Q-Office-Suite runtime: a standalone Nuitka-compiled Windows binary that hosts the nine sovereign Qovaryx specialists behind a local HTTP API. CPU inference. No GPU required. No internet required after download.
All nine specialist weights are published openly under Apache 2.0 at their per-model cards. The runtime entrypoint and dispatch are Qovaryx proprietary technology, the same posture as the options decoder runtime: weights and audit are open; entrypoint and recipe stay private.
The nine specialists hosted
All nine are 53.5M-parameter full fine-tunes of tjarvis91/qovaryx-50m-scratch-base.
No SmolLM2. No Qwen. No Llama. No borrowed foundation weights.
| Specialist | Job | Score |
|---|---|---|
| Q-Triage | Support ticket routing | 100% (60/60) |
| Q-DocCite | Document citation w/ page anchor | 100% (60/60) |
| Q-Invoice | Invoice JSON extractor | 100% (60/60) |
| Q-ToolCall | Agent tool-call JSON | 100% (60/60) |
| Q-Meeting | Meeting note structurer | 100% (60/60) |
| Q-FinCite | 10-K/10-Q citation | 100% (60/60) |
| Q-CmdSafe | Shell command safety triage | 100% (60/60) |
| Q-SheetExtract | Spreadsheet field extractor | 100% (37/37) |
| Q-Coder | Python code one-liners + skeletons | 100% (53/53) |
What's in this repo
~1.98 GB. Bundles Python 3.10 + PyTorch CPU + tokenizers + the cluster shell. No installer; just run.
README.md: this file.
What you need to provide
The runtime ships the dispatch layer; the 9 specialist weights are downloaded separately from their HF cards. Layout the runtime expects:
<your-dir>/
  tokenizer.json                      # from any of the 9 specialist repos (they share)
  weights/
    q-triage-50m-v2/final.pt          # from tjarvis91/Q-Triage-50M-Sovereign
    q-doccite-50m-v2/final.pt         # from tjarvis91/Q-DocCite-50M-Sovereign
    q-docextract-50m-v1/final.pt      # from tjarvis91/Q-Invoice-50M-Sovereign
    q-toolcall-50m-v1/final.pt        # from tjarvis91/Q-ToolCall-50M-Sovereign
    q-meeting-50m-v1/final.pt         # from tjarvis91/Q-Meeting-50M-Sovereign
    q-fincite-50m-v1/final.pt         # from tjarvis91/Q-FinCite-50M-Sovereign
    q-devsafe-50m-v1/final.pt         # from tjarvis91/Q-CmdSafe-50M-Sovereign
    q-sheetextract-50m-v4/final.pt    # from tjarvis91/Q-SheetExtract-50M-Sovereign
    q-coder-50m-v2/final.pt           # from tjarvis91/Q-Coder-50M-Sovereign
Total disk for all 9 weights: ~3 GB. Total RAM at idle: ~1 GB. RAM per active specialist: ~250 MB.
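Before launching, it can help to confirm that the directory matches the layout above. The following is a minimal sketch, not part of the runtime: `missing_files` is a hypothetical helper, and the folder names are copied from the layout listing on the assumption that the runtime looks for exactly those paths.

```python
from pathlib import Path

# Folder names copied from the layout above; each is expected to hold final.pt.
EXPECTED_WEIGHT_DIRS = [
    "q-triage-50m-v2",
    "q-doccite-50m-v2",
    "q-docextract-50m-v1",
    "q-toolcall-50m-v1",
    "q-meeting-50m-v1",
    "q-fincite-50m-v1",
    "q-devsafe-50m-v1",
    "q-sheetextract-50m-v4",
    "q-coder-50m-v2",
]


def missing_files(root):
    """Return expected files absent under root, as POSIX-style relative paths."""
    root = Path(root)
    expected = [root / "tokenizer.json"]
    expected += [root / "weights" / name / "final.pt" for name in EXPECTED_WEIGHT_DIRS]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_file()]
```

An empty list from `missing_files(r"C:\path\to\your-dir")` suggests the layout is complete; any returned entry names a file that still needs to be downloaded.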
How to run
$env:Q_OFFICE_WEIGHTS_DIR = "C:\path\to\weights"
$env:Q_OFFICE_TOKENIZER = "C:\path\to\tokenizer.json"
$env:Q_OFFICE_HOST = "127.0.0.1"
$env:Q_OFFICE_PORT = "8788"
First launch performs Nuitka onefile self-extraction (~30 s for the 2 GB
payload to %TEMP%). Subsequent launches reuse the extracted cache.
Once up, the runtime listens on http://127.0.0.1:8788.
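Because the first launch can spend around 30 s on self-extraction, a caller may want to poll until the server answers. A minimal sketch using only the Python standard library follows; `fetch_health` and `wait_until_ready` are hypothetical helper names, and the sketch assumes `GET /health` returns a JSON object with a truthy `ok` field once the runtime is usable.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(base_url="http://127.0.0.1:8788", timeout=2.0):
    """GET /health and return the decoded JSON body, or None if unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None  # server not up yet, or the body was not JSON


def wait_until_ready(probe=fetch_health, attempts=60, delay=1.0, sleep=time.sleep):
    """Poll probe() until it reports ok; True on success, False after giving up."""
    for _ in range(attempts):
        health = probe()
        if isinstance(health, dict) and health.get("ok"):
            return True
        sleep(delay)
    return False
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server; the defaults poll once per second for up to a minute.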
Endpoints
- GET /health: returns {ok, loaded}
- GET /specialists: list of specialist keys + descriptions
- POST /ask {text, [system], [max_new]}: route + run; returns the dispatch decision + output
- POST /run/<key> {text, [system], [max_new]}: force-route to a specific specialist
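For callers that prefer Python over curl, the four endpoints can be wrapped with the standard library. This is a sketch under assumptions: `build_request` and `call` are hypothetical helpers, the optional `system` and `max_new` fields are sent only when provided, and `max_new` is assumed to be an integer.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8788"


def build_request(path, text=None, system=None, max_new=None, base_url=BASE_URL):
    """Build a urllib Request: a GET when text is None, otherwise a JSON POST."""
    if text is None:
        return urllib.request.Request(base_url + path, method="GET")
    payload = {"text": text}
    if system is not None:
        payload["system"] = system
    if max_new is not None:
        payload["max_new"] = max_new  # assumed to be an integer
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request, timeout=30.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

With the runtime up, `call(build_request("/specialists"))` lists the specialists, `call(build_request("/ask", text="..."))` lets the dispatcher route, and `call(build_request("/run/q-coder", text="..."))` forces a specific specialist.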
Example: Q-Triage via /ask
curl -X POST http://127.0.0.1:8788/ask \
-H "Content-Type: application/json" \
-d '{"text":"Triage. Return JSON {category, priority}.\nSubject: 502 errors since 14:00 deploy"}'
Response:
{
"specialist": "q-triage",
"route_reason": "matched 2/2 cues",
"route_confidence": 1.0,
"output": "{\"category\": \"incident/sev2\", \"priority\": \"high\"}"
}
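Note that `output` in the response above is a string that itself contains JSON, so it needs a second decode. A minimal sketch, assuming the response shape shown in this example (`decode_ask_response` is a hypothetical helper):

```python
import json


def decode_ask_response(body):
    """Parse an /ask response body and decode its JSON-in-a-string output field."""
    response = json.loads(body)
    try:
        parsed = json.loads(response["output"])
    except (KeyError, TypeError, ValueError):
        parsed = None  # output missing or not valid JSON; the caller decides what to do
    return response, parsed


# The sample response from this section, written as a raw HTTP body.
sample = (
    '{"specialist": "q-triage", "route_reason": "matched 2/2 cues", '
    '"route_confidence": 1.0, '
    '"output": "{\\"category\\": \\"incident/sev2\\", \\"priority\\": \\"high\\"}"}'
)
response, parsed = decode_ask_response(sample)
print(response["specialist"], parsed["priority"])  # prints: q-triage high
```

Returning `None` for an undecodable `output` keeps the routing metadata available even when a specialist emits plain text, as Q-Coder does.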
Example: Q-Coder via /run
curl -X POST http://127.0.0.1:8788/run/q-coder \
-H "Content-Type: application/json" \
-d '{"text":"Define a function square that returns x squared."}'
Response:
{
"specialist": "q-coder",
"output": "def square(x):\n return x * x"
}
What this is NOT
- Not a chatbot frontend. This is an HTTP backend for embedding in other applications. Bring your own UI.
- Not a Linux/macOS binary. This release is Windows only. Source-tree Python invocation works cross-platform; see the per-specialist cards.
- Not a GPU runtime. CPU only by design. The full suite runs in ~1 GB RAM with sub-second latency per call on a modern laptop.
- Not a replacement for a verifier. This is the dispatch layer only. Deciding whether to accept an output is the job of whatever sits upstream or downstream of the runtime.
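As an illustration of the last point, a caller might run cheap structural checks before accepting a specialist's output. The sketch below is one possible downstream check, not the project's own verifier; `python_parses` and `json_has_keys` are hypothetical helpers, and the checks cover syntax and shape only, not correctness.

```python
import ast
import json


def python_parses(source):
    """True if source is syntactically valid Python (it is parsed, never executed)."""
    try:
        ast.parse(source)
        return True
    except (SyntaxError, ValueError):
        return False


def json_has_keys(output, required):
    """True if output is a JSON object containing every key in required."""
    try:
        data = json.loads(output)
    except (TypeError, ValueError):
        return False
    return isinstance(data, dict) and all(key in data for key in required)
```

For example, `python_parses` could gate Q-Coder output and `json_has_keys(output, ["category", "priority"])` could gate Q-Triage output before either is passed on.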
License & posture
The weights (each specialist's pytorch_model.pt) are licensed under Apache 2.0; see the per-model HF cards.
The Q-Office-Suite runtime entrypoint, the cluster shell routing policy, the crystal corpora, the eval gate constants, and the training pipeline are Qovaryx proprietary technology and are not included in this release.
This is the same posture as every previous Qovaryx public release: ship the weights and the audit, not the recipe.
Watermark
The binary is Nuitka-compiled; the dispatch layer is not source-readable without reverse engineering. SHA256 fingerprint is in
Community & support
- Research devlog: https://github.com/thron-j/qovaryx-ai-research
- Discord (Qovaryx community): https://discord.gg/PtuHZDv5ju
- Ko-fi (we cover GPU bills): https://ko-fi.com/tjarvis91
- Qovaryx runtime, official site (sibling release): https://qovaryx.jehorizon.com
If you find a routing-decision failure mode the readme doesn't cover, open a discussion here or come to the Discord.