# TorchSight Beam q8_0

Cybersecurity document classifier. LoRA fine-tune of Qwen 3.5 27B, quantized to q8_0. ~28 GB GGUF.

Recommended hardware: 48 GB GPU or 64 GB Mac.

## Benchmark Results

Two benchmarks, evaluated under identical methodology (Alpaca-style prompt, Ollama `/api/generate`, Modelfile temperature 0.1, `num_predict=2048`):

### Primary: eval-1000-synthetic (1000 stratified samples)

| Model | Category Acc | 95% CI | Subcategory Acc | Type |
|---|---|---|---|---|
| Beam q4_K_M | 95.1% | [93.8, 96.4] | 48.5% | Local (LoRA) |
| Beam f16 | 93.0% | [91.2, 94.5] | 51.3% | Local (LoRA) |
| Beam q8_0 | 92.7% | [90.9, 94.2] | 51.3% | Local (LoRA) |
| Claude Sonnet 4 | 79.9% | – | 23.0% | Commercial API |
| Claude Opus 4 | 79.9% | – | 22.5% | Commercial API |
| GPT-5 | 76.9% | – | 11.6% | Commercial API |
| Gemini 2.5 Pro | 75.4% | – | 21.0% | Commercial API |
| Regex baseline (49 patterns) | 52.7% | – | – | Rule-based |
| Qwen 3.5 27B base (no LoRA) | 43.3% | – | 4.3% | Local |
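The card does not state how the confidence intervals were computed. One common choice for binomial accuracy CIs is the Wilson score interval, sketched below; it lands close to the reported ranges, but the actual intervals may come from a different method (e.g. bootstrap), so treat this as illustrative.

```python
import math

def wilson_ci(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion, returned in percent."""
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return (100 * (center - margin), 100 * (center + margin))

# Beam q8_0 on the primary benchmark: 92.7% correct out of 1000 samples
lo, hi = wilson_ci(0.927, 1000)
```

For 92.7% on n = 1000 this yields roughly [90.9, 94.2], in line with the q8_0 row above; other rows round slightly differently, which is another hint that the card may use a different method.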

### External: eval-500-external (500 held-out samples from real public datasets)

The set comprises held-out splits of the training sources (NVD, NIST, AI4Privacy, Enron, phishing) plus MTSamples (medical transcriptions, explicitly excluded from training).

| Model | Category Acc | 95% CI | Subcategory Acc | Δ vs. primary |
|---|---|---|---|---|
| Beam q4_K_M | 93.8% | [91.3, 95.6] | 51.4% | −1.3 pp |
| Beam q8_0 | 91.2% | [88.4, 93.4] | 46.4% | −1.5 pp |
| Beam f16 | 91.0% | [88.2, 93.2] | 47.2% | −2.0 pp |
| Claude Sonnet 4 | 86.4% | – | – | +6.5 pp |
| Gemini 2.5 Pro | 82.0% | – | – | +6.6 pp |
| GPT-5 | 65.8% | – | – | −11.1 pp |
| Regex baseline | 29.6% | – | – | −23.1 pp |
| Qwen 3.5 27B base | 28.0% | – | 0% | −15.3 pp |

Beam q4_K_M's gap over Claude Sonnet 4 is statistically significant (McNemar's χ²₁ = 126.7, p ≈ 2 × 10⁻²⁹), as is its gap over the Qwen base model without the LoRA (χ²₁ = 489.5, p ≈ 2 × 10⁻¹⁰⁸): with the identical prompt, fine-tuning contributes +65.8 pp on external data.
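McNemar's test compares two classifiers on the same items and uses only the discordant pairs (items one model gets right and the other wrong). A minimal sketch of the statistic and its p-value; the toy correctness vectors in the usage below stand in for the real per-sample results:

```python
from math import erfc, sqrt

def mcnemar_chi2(correct_a: list, correct_b: list) -> float:
    """McNemar's chi-squared statistic (1 df, no continuity correction),
    from per-sample correctness of two models on the same items."""
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if not x and y)
    return 0.0 if b + c == 0 else (b - c) ** 2 / (b + c)

def chi2_sf_1df(x: float) -> float:
    """Survival function of chi-squared with 1 df: P(X >= x) = erfc(sqrt(x/2))."""
    return erfc(sqrt(x / 2))
```

Usage: with two boolean lists `correct_a` and `correct_b` of equal length, `chi2_sf_1df(mcnemar_chi2(correct_a, correct_b))` gives the two-sided p-value.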

## Usage with Ollama

```bash
# Pull from Ollama Hub
ollama pull torchsight/beam:q8_0

# Or build locally from this GGUF + Modelfile
ollama create torchsight/beam:q8_0 -f Modelfile
```

Modelfile (note the triple-quoted `SYSTEM` string, which Ollama requires for multi-line values):

```
FROM ./beam-1.0-q8_0.gguf
SYSTEM """You are TorchSight, a cybersecurity document classifier. Analyze the provided text and identify ALL security-relevant findings.

For each finding, output a JSON object with:
- category: one of [pii, credentials, financial, medical, confidential, malicious, safe]
- subcategory: specific type (e.g., pii.identity, malicious.injection, credentials.api_key)
- severity: one of [critical, high, medium, low, info]
- explanation: detailed explanation including specific values found.

If a document contains multiple types of sensitive data, return a finding for EACH one.
If the text is clean/safe, output a single finding with category "safe".

Respond ONLY with a JSON array of findings."""
PARAMETER temperature 0.1
PARAMETER top_p 0.9
PARAMETER num_predict 2048
```
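Once the model is created, a document can be classified over Ollama's HTTP API and the findings parsed from the JSON array the system prompt requests. A sketch — the endpoint and request shape follow Ollama's `/api/generate` API, while the helper names and the severity-ranking example are ours:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def classify(document: str, model: str = "torchsight/beam:q8_0") -> list:
    """Send one document to the model and return the parsed findings list."""
    payload = {"model": model, "prompt": document, "stream": False}
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The system prompt asks for a bare JSON array in "response"
        return json.loads(json.load(resp)["response"])

def worst_severity(findings: list) -> str:
    """Highest severity present, per the Modelfile's severity scale."""
    order = ["critical", "high", "medium", "low", "info"]
    present = {f.get("severity") for f in findings}
    return next((s for s in order if s in present), "info")
```

For example, `worst_severity(classify(open("email.txt").read()))` would triage a document by its most severe finding.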

## Reproducibility

Eval scripts and benchmark data: https://github.com/torchsight/torchsight/tree/main/beam/evaluation

```bash
git clone https://github.com/torchsight/torchsight
cd torchsight/beam/evaluation
BEAM_MODEL=torchsight/beam:q8_0 python scripts/eval_beam.py     # primary
BEAM_MODEL=torchsight/beam:q8_0 python scripts/eval_external.py # external
```

## Citation

```bibtex
@misc{torchsight-beam-q8_0-2026,
  title  = {TorchSight Beam q8_0: cybersecurity document classifier},
  author = {Dobrovolskyi, Ivan},
  year   = {2026},
  url    = {https://huggingface.co/torchsight/beam-q8_0},
}
```

## License

Apache 2.0
