Q-Office-Suite Runtime — local HTTP server hosting all 9 sovereign specialists

Qovaryx — sovereign options-decoder AI

This model is part of Qovaryx, an 11-head sovereign AI cluster that grades options trades in under a millisecond on CPU.

Download the desktop beta: https://qovaryx.jehorizon.com/download.html

Read the research: https://qovaryx.jehorizon.com/research

Main site: https://qovaryx.jehorizon.com

Not financial advice. Options trading carries substantial risk.

📦 Shipped inside the Qovaryx app

This is a component of the Qovaryx Options Decoder cluster. It is published here for transparency + research reproducibility — the runtime is bundled in the desktop app, not installed from Hugging Face. Installer links have been removed from this card.

This repo ships the Q-Office-Suite runtime: a standalone Nuitka-compiled Windows binary that hosts the nine sovereign Qovaryx specialists behind a local HTTP API. CPU inference. No GPU required. No internet required after download.

All nine specialist weights are published openly under Apache 2.0 at their per-model cards. The runtime entrypoint and dispatch are Qovaryx proprietary technology — same posture as the options decoder runtime: weights and audit are open; entrypoint and recipe stay private.

The nine specialists hosted

All nine are 53.5M-parameter full-finetunes from tjarvis91/qovaryx-50m-scratch-base. No SmolLM2. No Qwen. No Llama. No borrowed foundation weights.

Specialist	Job	Score (passed/total)
Q-Triage	Support ticket routing	100% (60/60)
Q-DocCite	Document citation w/ page anchor	100% (60/60)
Q-Invoice	Invoice JSON extractor	100% (60/60)
Q-ToolCall	Agent tool-call JSON	100% (60/60)
Q-Meeting	Meeting note structurer	100% (60/60)
Q-FinCite	10-K/10-Q citation	100% (60/60)
Q-CmdSafe	Shell command safety triage	100% (60/60)
Q-SheetExtract	Spreadsheet field extractor	100% (37/37)
Q-Coder	Python code one-liners + skeletons	100% (53/53)

What's in this repo

~1.98 GB. Bundles Python 3.10 + PyTorch CPU + tokenizers + the cluster shell. No installer; just run.

README.md — this file.

What you need to provide

The runtime ships the dispatch layer; the 9 specialist weights are downloaded separately from their HF cards. Layout the runtime expects:

<your-dir>/

  tokenizer.json                  # from any of the 9 specialist repos (all nine share one tokenizer)
  weights/
    q-triage-50m-v2/final.pt      # from tjarvis91/Q-Triage-50M-Sovereign
    q-doccite-50m-v2/final.pt     # from tjarvis91/Q-DocCite-50M-Sovereign
    q-docextract-50m-v1/final.pt  # from tjarvis91/Q-Invoice-50M-Sovereign
    q-toolcall-50m-v1/final.pt    # from tjarvis91/Q-ToolCall-50M-Sovereign
    q-meeting-50m-v1/final.pt     # from tjarvis91/Q-Meeting-50M-Sovereign
    q-fincite-50m-v1/final.pt     # from tjarvis91/Q-FinCite-50M-Sovereign
    q-devsafe-50m-v1/final.pt     # from tjarvis91/Q-CmdSafe-50M-Sovereign
    q-sheetextract-50m-v4/final.pt # from tjarvis91/Q-SheetExtract-50M-Sovereign
    q-coder-50m-v2/final.pt       # from tjarvis91/Q-Coder-50M-Sovereign

Total disk for all 9 weights: ~3 GB. Total RAM at idle: ~1 GB. RAM per active specialist: ~250 MB.
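Before launching, it can help to confirm that the layout above is complete. The following is a minimal sketch, not part of the runtime: the helper name `missing_files` is hypothetical, and the folder and file names are copied from the listing above.

```python
from pathlib import Path

# Weight folder names copied from the layout listing above.
EXPECTED_WEIGHT_DIRS = [
    "q-triage-50m-v2",
    "q-doccite-50m-v2",
    "q-docextract-50m-v1",
    "q-toolcall-50m-v1",
    "q-meeting-50m-v1",
    "q-fincite-50m-v1",
    "q-devsafe-50m-v1",
    "q-sheetextract-50m-v4",
    "q-coder-50m-v2",
]


def missing_files(root):
    """Return expected files that are absent under root (hypothetical helper)."""
    root = Path(root)
    expected = [root / "tokenizer.json"]
    expected += [root / "weights" / name / "final.pt" for name in EXPECTED_WEIGHT_DIRS]
    # Report paths relative to root so the output mirrors the listing.
    return [str(p.relative_to(root)) for p in expected if not p.is_file()]
```

Calling `missing_files(r"C:\path\to\your-dir")` returns an empty list when the tokenizer and all nine `final.pt` files are in place.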

How to run

$env:Q_OFFICE_WEIGHTS_DIR = "C:\path\to\weights"
$env:Q_OFFICE_TOKENIZER   = "C:\path\to\tokenizer.json"
$env:Q_OFFICE_HOST        = "127.0.0.1"
$env:Q_OFFICE_PORT        = "8788"

On first launch, the Nuitka onefile binary extracts its ~2 GB payload to %TEMP%, which takes about 30 s. Subsequent launches reuse the extracted cache.
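Because the first launch can take a while, a calling application may want to wait for the server before sending work. The sketch below polls GET /health (listed under Endpoints) until it answers. It assumes the `ok` field is truthy once the server is ready, which this card does not spell out; the function names, the 90 s timeout and the 1 s interval are illustrative choices.

```python
import http.client
import json
import time
import urllib.request


def fetch_health(base_url):
    """GET /health and return the decoded JSON body, or None if unreachable."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=2) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, or a non-JSON body: treat as not ready.
        return None


def wait_until_healthy(base_url="http://127.0.0.1:8788", timeout_s=90.0,
                       interval_s=1.0, fetch=fetch_health, sleep=time.sleep):
    """Poll until /health reports a truthy "ok" (assumed) or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch(base_url)
        if isinstance(body, dict) and body.get("ok"):
            return body
        if time.monotonic() >= deadline:
            return None
        sleep(interval_s)
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running server.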

Once up, the runtime listens on http://127.0.0.1:8788.

Endpoints

GET /health — {ok, loaded}
GET /specialists — list of specialist keys + descriptions
POST /ask {text, [system], [max_new]} — route + run; returns the dispatch decision + output
POST /run/<key> {text, [system], [max_new]} — force-route to a specific specialist
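The endpoints above can be called from any HTTP client. As a minimal sketch using only the Python standard library: `build_request` and `call` are hypothetical helper names, the base URL is taken from the "How to run" settings, and the optional `system` and `max_new` fields are sent only when provided.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8788"  # host and port from the "How to run" settings


def build_request(path, text=None, system=None, max_new=None, base_url=BASE_URL):
    """Build a urllib Request: GET when text is None, otherwise a JSON POST."""
    if text is None:
        return urllib.request.Request(base_url + path, method="GET")
    payload = {"text": text}
    if system is not None:
        payload["system"] = system
    if max_new is not None:
        payload["max_new"] = max_new
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `call(build_request("/specialists"))` lists the specialist keys, and `call(build_request("/run/q-coder", "..."))` force-routes a prompt to one specialist.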

Example: Q-Triage via /ask

curl -X POST http://127.0.0.1:8788/ask \
  -H "Content-Type: application/json" \
  -d '{"text":"Triage. Return JSON {category, priority}.\nSubject: 502 errors since 14:00 deploy"}'

Response:

{
  "specialist": "q-triage",
  "route_reason": "matched 2/2 cues",
  "route_confidence": 1.0,
  "output": "{\"category\": \"incident/sev2\", \"priority\": \"high\"}"
}
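Note that in the sample response the `output` field is itself a JSON-encoded string, so a client has to decode twice: once for the envelope and once for the specialist's answer. A minimal sketch, assuming other Q-Triage responses follow the same shape; `decode_triage` is a hypothetical helper and the required keys are the two named in the example prompt.

```python
import json

# The sample /ask response shown above, as a raw HTTP body.
RAW_RESPONSE = r'''
{
  "specialist": "q-triage",
  "route_reason": "matched 2/2 cues",
  "route_confidence": 1.0,
  "output": "{\"category\": \"incident/sev2\", \"priority\": \"high\"}"
}
'''


def decode_triage(raw_body, required=("category", "priority")):
    """Return (envelope, result); result is None if output is not the expected JSON."""
    envelope = json.loads(raw_body)
    try:
        result = json.loads(envelope["output"])  # second decode: the model's answer
    except (KeyError, TypeError, ValueError):
        return envelope, None
    if not isinstance(result, dict) or any(key not in result for key in required):
        return envelope, None
    return envelope, result


envelope, triage = decode_triage(RAW_RESPONSE)
print(envelope["specialist"], triage)
```

Returning `None` for a malformed answer keeps the accept-or-reject decision in the caller, where this card says it belongs.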

Example: Q-Coder via /run

curl -X POST http://127.0.0.1:8788/run/q-coder \
  -H "Content-Type: application/json" \
  -d '{"text":"Define a function square that returns x squared."}'

Response:

{
  "specialist": "q-coder",
  "output": "def square(x):\n    return x * x"
}
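Generated code should be checked before it is used. One cheap check, sketched here as an assumption about how a caller might handle Q-Coder output rather than anything the runtime does, is to parse the `output` string with Python's `ast` module. This confirms the syntax is valid without executing the code; the helper names are hypothetical.

```python
import ast


def is_valid_python(source):
    """Return True if source parses as Python; the code is never executed."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        return False
    return True


def defined_functions(source):
    """Names of top-level functions defined in source (empty if it fails to parse)."""
    if not is_valid_python(source):
        return []
    return [node.name for node in ast.parse(source).body
            if isinstance(node, ast.FunctionDef)]


output = "def square(x):\n    return x * x"  # "output" field from the response above
print(is_valid_python(output), defined_functions(output))
```

A syntax check says nothing about whether the function is correct; tests or review still have to cover that.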

What this is NOT

Not a chatbot frontend. This is an HTTP backend for embedding in other applications. Bring your own UI.
Not a Linux/macOS binary. This release is Windows only. Source-tree Python invocation works cross-platform — see the per-specialist cards.
Not a GPU runtime. CPU only by design. The full suite runs in ~1 GB RAM with sub-second latency per call on a modern laptop.
Not a replacement for a verifier. This is the dispatch layer; deciding whether to accept a specialist's output is the responsibility of the code upstream or downstream of it.

License & posture

The weights (each specialist's pytorch_model.pt) are Apache 2.0 at their per-model HF cards.

The Q-Office-Suite runtime entrypoint, the cluster shell routing policy, the crystal corpora, the eval gate constants, and the training pipeline are Qovaryx proprietary technology and are not included in this release.

This is the same posture as every previous Qovaryx public release: ship the weights and the audit, not the recipe.

Watermark

The binary is Nuitka-compiled; the dispatch layer is not source-readable without reverse engineering. SHA256 fingerprint is in

Community & support

Research devlog: https://github.com/thron-j/qovaryx-ai-research
Discord (Qovaryx community): https://discord.gg/PtuHZDv5ju
Ko-fi (we cover GPU bills): https://ko-fi.com/tjarvis91
Qovaryx runtime, sibling release (official site): https://qovaryx.jehorizon.com

If you find a routing-decision failure mode the readme doesn't cover, open a discussion here or come to the Discord.
