Cybersecurity Assistant – Cybersecurity Reasoning Adapter (Phi-3.5 Mini, LoRA)

Model Summary

Cybersecurity Assistant is a LoRA fine-tuned adapter for microsoft/Phi-3.5-mini-instruct (3.8B parameters), built as a proof of concept for cybersecurity reasoning tasks and as a test of the project's dataset-to-model pipeline. The model is trained to assist with red team tradecraft analysis, blue team detection reasoning, PowerShell/CMD/Bash command interpretation, network analysis (nmap, Wireshark, tcpdump), and Sigma rule evaluation.

The adapter targets a 3B-class model and is trained with QLoRA 4-bit quantization, making it practical to run on consumer-grade hardware with 8GB of VRAM. It is the first completed training run in a sequenced multi-model project that also plans to train Mistral 7B Instruct v0.3, Llama 3.1 8B Instruct, Qwen2.5 7B Instruct, DeepSeek-R1-Distill-Llama-8B, and Mistral NeMo 12B Instruct on the same corpus.


Model Details

Model Description

  • Developed by: Cybersecurity Assistant Project (internal, single-operator lab)
  • Model type: Causal language model; LoRA adapter over Phi-3.5-mini-instruct
  • Language(s): English (technical/cybersecurity domain)
  • License: Adapter weights: internal use only; base model: MIT (microsoft/Phi-3.5-mini-instruct)
  • Finetuned from: microsoft/Phi-3.5-mini-instruct
  • Training framework: PEFT 0.18.1 / TRL / Transformers
  • Quantization: 4-bit NF4 (QLoRA) via bitsandbytes

Intended Uses

Direct Use

Cybersecurity Assistant is designed for use in a controlled cybersecurity lab environment. It is intended to assist a trained security professional with:

  • Interpreting CLI command output across PowerShell (5.1 and 7.x), Windows CMD, and Linux Bash
  • Reasoning about red team tradecraft, including execution sequences, artifact footprints, stealth ratings, and cleanup requirements
  • Mapping adversarial techniques to MITRE ATT&CK framework IDs
  • Analyzing network telemetry from tools such as nmap, tcpdump, and Wireshark
  • Evaluating Sigma detection rules and blue team SOC workflows
  • Distinguishing true positive threat indicators from benign administrative activity (false positive reduction)

Downstream Use

The adapter is designed to be merged into its base model using PEFT merge_and_unload and served via Ollama + Open WebUI for network-accessible inference in a homelab environment. Downstream integration into an agentic scaffold with a human-in-the-loop validation layer is the intended deployment path.
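The merge step described above can be sketched as follows. This is a minimal sketch, not the project's actual script: the paths are placeholders, and imports are deferred inside the function so the snippet can be defined without transformers/peft installed.

```python
def merge_adapter(base_model_id, adapter_path, output_dir):
    """Fold LoRA adapter weights into the base model and save a
    standalone model directory for serving (e.g., conversion for
    Ollama). Imports are deferred so this sketch can be defined
    without the libraries installed."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Load the base model (unquantized) so the merge is clean
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # Attach the LoRA adapter, then merge its weights into the base
    merged = PeftModel.from_pretrained(base, adapter_path).merge_and_unload()
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
    return output_dir
```

After saving, the merged directory can be converted to a format Ollama serves (e.g., GGUF via llama.cpp's conversion tooling) and exposed through Open WebUI.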

Out-of-Scope Use

  • Autonomous offensive operations without human oversight. The model is not designed for unsupervised red team execution. All tool use requires human approval.
  • Production or enterprise deployment without further evaluation. The current adapter was trained on a small dataset (207 records, of which only 11 were correctly formatted at the time of Run 1) and is not yet at deployment-ready benchmark thresholds.
  • Novel zero-day exploitation or techniques outside the training distribution. The model will hallucinate outside its training domain.
  • Replacement of a qualified security analyst. This model is an analyst aid, not a substitute.

Bias, Risks, and Limitations

  • Dataset size: The initial training run used 207 total records, of which only 11 were correctly formatted and ingested. This is far below the project's target threshold (3,000–5,000+ formatted records for meaningful deployment readiness). The current model has thin domain coverage and should be treated as a proof-of-concept.
  • Hallucination risk: Like all LLMs, this model can produce confident but incorrect command syntax, flag definitions, or output interpretations β€” especially for tools, OS versions, or techniques outside the training distribution.
  • Schema coverage gaps: At the time of Run 1, 26 of 207 records were skipped due to unrecognized schemas. Format coverage was 87.4% by record count but only ~5.3% by formatted output (11/207).
  • Benchmark evaluation not yet functional: Run 1 scored 0/10 on the benchmark suite due to a schema mismatch in the benchmark evaluator. Benchmark results from Run 1 do not reflect model capability.
  • Platform scope: Training data focuses on Windows (PowerShell 5.1, PS 7.x, CMD) with supplementary Linux/Bash coverage. Cross-shell parity is incomplete. IPv6 coverage is minimal.
  • Red team content: This model has been trained on offensive security tradecraft. It is intended for authorized lab use by trained professionals only. It should not be distributed or used outside a controlled, authorized environment.

Recommendations

Users must be security professionals operating within authorized lab or engagement boundaries. The model requires a human validation layer for any tool execution suggestions. Do not use for autonomous operations or outside approved scope.
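The required human validation layer can be as simple as a gate that refuses to run any model-suggested command without explicit analyst approval. A minimal sketch (function and parameter names are illustrative, not part of the project's codebase):

```python
def gated_execute(command, approve, run):
    """Run a model-suggested command only if the approval callback
    (a human analyst in practice) explicitly returns True."""
    if approve(command) is True:
        return run(command)
    return None  # rejected or unreviewed suggestions are never executed

# Example: the analyst rejects an out-of-scope scan, so run() is never called
result = gated_execute(
    "nmap -sS 10.0.0.0/8",
    approve=lambda cmd: False,   # stand-in for a human review prompt
    run=lambda cmd: "executed",  # stand-in for actual tool execution
)
```

The key property is that execution is opt-in per command: the default path does nothing, which matches the project's human-in-the-loop deployment requirement.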


How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model_id = "microsoft/Phi-3.5-mini-instruct"
adapter_path = "./adapter"  # path to saved LoRA adapter weights

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="eager",  # flash-attention not required
)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

# Phi-3.5-mini-instruct expects chat-formatted input
messages = [
    {"role": "user", "content": "What does `Get-NetTCPConnection | Where-Object State -eq 'Established'` return, and how should a SOC analyst interpret the output?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Training Details

Training Data

The training dataset is a synthetic cybersecurity corpus in JSONL format, organized into three epoch families:

| Epoch | Folder | Contents | Records |
|-------|--------|----------|---------|
| 1 | 03_toolknowledge + 05_detection_rules | Tool syntax references, Sigma/YARA detection rules, tradecraft matrix, subnetting, tcpdump/Wireshark reasoning | 97 |
| 2 | 04_goldens | Execution traces (success and failure cases) for Linux privesc, Windows initial access, nmap, persistence, network analysis, evasion | 87 |
| 3 | 06_contrast_pairs | Ambiguity and intent reasoning: admin vs. malicious command pairs, Linux/Windows privesc contrast, cross-platform equivalence | 23 |
| **Total** | | | **207** |

Domain coverage at time of Run 1:

| Domain | Records | Target | Coverage |
|--------|---------|--------|----------|
| PowerShell Administration | ~50 | 550 | 9% |
| Red Team & Offensive Tradecraft | ~50 | 220 | 23% |
| Linux / POSIX Operations | ~60 | 150 | 40% |
| Windows Command Line | ~45 | 85 | 53% |
| Network Analysis (nmap, tcpdump, Wireshark) | ~30 | 150 | 20% |
| Cybersecurity Reasoning / Ambiguity | ~60 | 200 | 30% |
| Sigma Detection Rules | 35 | 200 | 18% |
| Blue Team / SOC Workflows | ~40 | 200 | 20% |
| IPv4/IPv6 / Port / Service Identification | ~10 | 150 | 7% |
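The coverage column is simply records divided by target. A quick sanity check of a few rows in plain Python:

```python
# (records, target) pairs taken from the domain coverage table
domains = {
    "PowerShell Administration": (50, 550),
    "Red Team & Offensive Tradecraft": (50, 220),
    "Linux / POSIX Operations": (60, 150),
    "Windows Command Line": (45, 85),
}
for name, (records, target) in domains.items():
    # :.0% formats the ratio as a whole-number percentage
    print(f"{name}: {records / target:.0%}")
```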

Source material includes extracted content from technical references covering PowerShell, Windows CMD, Linux/Bash, network analysis (Nmap, Wireshark, tcpdump), red team field manuals (RTFM), NIST cybersecurity frameworks, and Sigma detection rule libraries.

Benchmark data (07_benchmarks/) was explicitly excluded from training and held as a frozen evaluation set.
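The corpus is JSONL, and Run 1 skipped records whose schema the formatter did not recognize. A loader of that kind typically gates each record on required keys before formatting; a hedged sketch (the required keys here are illustrative assumptions, not the project's actual field names):

```python
import json

REQUIRED_KEYS = {"instruction", "response"}  # assumed schema, for illustration

def load_jsonl(lines):
    """Parse JSONL records, keeping only those matching the expected
    schema. Returns (accepted records, skipped count)."""
    accepted, skipped = [], 0
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1
            continue
        if REQUIRED_KEYS <= rec.keys():  # all required keys present
            accepted.append(rec)
        else:
            skipped += 1  # unrecognized schema, as in Run 1
    return accepted, skipped

recs, skipped = load_jsonl([
    '{"instruction": "explain nmap -sV", "response": "..."}',
    '{"tool": "tcpdump"}',  # missing required keys -> skipped
])
```

Tracking the skipped count explicitly is what surfaces the kind of formatting gap reported for Run 1.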

Training Procedure

Training uses Supervised Fine-Tuning (SFT) via TRL's SFTTrainer with QLoRA (4-bit NF4 quantization).

Curriculum structure:

  • Epoch 1: Tool knowledge + detection rules → syntax and rule foundation
  • Epoch 2: Golden execution traces → execution reality (success and failure)
  • Epoch 3: Contrast pairs → ambiguity and intent reasoning
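Mechanically, the curriculum just concatenates the epoch folders in a fixed order so the trainer sees foundations before ambiguity. A minimal sketch (folder names come from the dataset table; the record values are toy stand-ins):

```python
# Epoch order from the curriculum above
CURRICULUM = [
    ("epoch_1", ["03_toolknowledge", "05_detection_rules"]),
    ("epoch_2", ["04_goldens"]),
    ("epoch_3", ["06_contrast_pairs"]),
]

def build_training_order(records_by_folder):
    """Flatten records into the curriculum's fixed epoch order."""
    ordered = []
    for _epoch, folders in CURRICULUM:
        for folder in folders:
            ordered.extend(records_by_folder.get(folder, []))
    return ordered

# Input order does not matter; output follows the curriculum
order = build_training_order({
    "06_contrast_pairs": ["contrast"],
    "03_toolknowledge": ["syntax"],
    "04_goldens": ["trace"],
})
```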

Training Hyperparameters (Run 1)

| Parameter | Value |
|-----------|-------|
| Training regime | bf16 mixed precision |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Trainable parameters | 8,912,896 / 2,018,053,120 (0.44%) |
| Max sequence length | 1024 tokens |
| Epochs | 3 |
| Total training steps | 69 |
| Formatted / ingested examples | 11 / 207 |
| Warmup | ratio configured (deprecated in v5.2; migrated to warmup_steps) |
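The trainable-parameter ratio in the table follows directly from the two counts. A quick check, with the LoRA settings expressed as a plain dict mirroring the table (this is not the project's actual training script):

```python
lora_config = {  # mirrors the Run 1 hyperparameters above
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
}

# Trainable vs. total parameter counts reported by the trainer
trainable, total = 8_912_896, 2_018_053_120
pct = 100 * trainable / total
print(f"trainable: {pct:.2f}%")
```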

Hardware (Run 1)

| Component | Specification |
|-----------|---------------|
| Machine | Alienware M16 R2 |
| CPU | Intel Core Ultra 7 155H (16 cores / 22 logical) |
| GPU | NVIDIA GeForce RTX 4070 Laptop GPU, 8GB GDDR6 VRAM |
| RAM | 64GB system memory |
| OS | Windows 11 |
| Training time | ~16 minutes (948.5 seconds) |
| Peak VRAM usage | ~4.5GB |
| Peak GPU temp | ~74°C |

Evaluation

Testing Data

Benchmark evaluation uses a frozen evaluation set stored in 07_benchmarks/frozen_eval/. This dataset was never included in training data. Benchmark families cover:

  1. Tool syntax correctness
  2. Execution trace interpretation
  3. Failure diagnosis and recovery
  4. Ambiguity and intent reasoning (false positive discrimination)
  5. Tradecraft and MITRE ATT&CK mapping

Metrics

  • Per-epoch training loss
  • Mean token accuracy
  • Benchmark score (correct / total evaluated questions, across 5 families)
  • Benchmark delta (post-training score minus pre-training baseline)
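The benchmark score and delta metrics reduce to simple arithmetic; a sketch of the scoring logic:

```python
def benchmark_score(results):
    """Fraction of correct answers across all evaluated questions."""
    correct = sum(1 for r in results if r)
    return correct / len(results)

def benchmark_delta(post, pre):
    """Post-training score minus pre-training baseline."""
    return post - pre

# Illustrative values: a 0/10 baseline and a 7/10 post-training run
pre = benchmark_score([False] * 10)
post = benchmark_score([True] * 7 + [False] * 3)
delta = benchmark_delta(post, pre)
```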

Results (Run 1)

| Metric | Value |
|--------|-------|
| Epoch 1 loss | 8.1072 |
| Epoch 2 loss | ~6.5 |
| Epoch 3 loss | 5.2204 |
| Mean train loss | 7.17 |
| Mean token accuracy | 0.3049 |
| Loss reduction | 54.5% (11.49 → 5.22) |
| Benchmark score | 0/10 (evaluator schema mismatch; not reflective of model capability) |

Note: The benchmark evaluator in Run 1 failed to match benchmark records due to a schema format mismatch. Benchmark results from Run 1 are invalid. The benchmark handler was corrected for Run 6 onward.

Multi-Run Summary

| Run | Records | Epochs | Final Loss | Benchmark | Key Finding |
|-----|---------|--------|------------|-----------|-------------|
| Runs 1–3 | 207 | 3 | 5.0–6.2 | 0/10 (evaluator error) | DynamicCache bug; only 11/207 records formatted |
| Run 4 | 502 | 5 | 2.32 | 7/10 (incoherent) | Metadata contamination in V2 records |
| Run 5 | 479 | 5 | 4.18 | 7/10 (partial) | Format oscillation between V1 and V2 style |
| Run 6 | 479 | 5 | TBD | TBD | Format unified; benchmark handlers added |
Project deployment threshold: ≥85% on the Failure Diagnosis AND Ambiguity/Intent Reasoning benchmarks across a dataset of 3,000+ formatted records.

Current realistic competence levels by dataset size:

| Formatted records | Expected benchmark | Agent readiness |
|-------------------|--------------------|-----------------|
| ~11 (Run 1) | ~20–35% | Not deployable |
| 500 | ~45–60% | Experimental only |
| 1,500 | ~65–75% | Limited deployment |
| 3,000+ | ~80–90% | Approaching target |
| 5,000+ | 85%+ sustained | Project threshold met |

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: NVIDIA GeForce RTX 4070 Laptop GPU (8GB GDDR6)
  • Hours used: ~0.27 hours per training run (Run 1: 948.5 seconds)
  • Cloud Provider: N/A (local training on Alienware M16 R2)
  • Compute Region: N/A (on-premises)
  • Carbon Emitted: Minimal (single laptop GPU, ~16-minute runs)

Technical Specifications

Model Architecture and Objective

  • Base architecture: Phi-3.5-mini-instruct (3.8B parameters, dense decoder-only transformer)
  • Adaptation method: Low-Rank Adaptation (LoRA) via PEFT
  • Quantization: 4-bit NF4 (QLoRA); base model weights frozen and quantized, LoRA adapters trained in bf16
  • Objective: Supervised fine-tuning (SFT) on structured cybersecurity reasoning records
  • Attention: Standard attention (flash-attention not installed; attn_implementation='eager' used)
  • Context window: 1024 tokens (max_seq_length at training time)

Compute Infrastructure

  • Training hardware: Alienware M16 R2 (RTX 4070 Laptop GPU 8GB, Intel Core Ultra 7 155H, 64GB RAM)
  • Training software: Python 3.13, PyTorch (CUDA), Transformers, PEFT 0.18.1, TRL, bitsandbytes
  • Logging: Custom PowerShell dashboard (dashboard.ps1), per-step JSONL loss logs, hardware telemetry via pynvml + psutil

Glossary

  • LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning method that trains small rank-decomposition matrices alongside frozen base model weights. Enables fine-tuning of large models on consumer hardware.
  • QLoRA: LoRA applied to a 4-bit quantized base model. Reduces VRAM requirements significantly while preserving most fine-tuning quality.
  • Golden records: Training records derived from lab-validated command execution: deterministic, verified, and clean. The most important record type for tool-correctness training.
  • Contrast pairs: Training records presenting ambiguous or similar-looking commands with different intents, used to sharpen the model's decision boundary between malicious and benign activity.
  • Epoch curriculum: The deliberate ordering of training data families across epochs (tool knowledge → execution traces → ambiguity reasoning) to build domain competence progressively.
  • MITRE ATT&CK: A publicly available knowledge base of adversary tactics, techniques, and procedures used to label red team training records.
  • Sigma rules: Generic signatures for SIEM systems used to detect adversarial behavior. A key domain in the blue team training corpus.
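The LoRA entry above can be made concrete: the adapter learns low-rank matrices B and A whose product, scaled by alpha/r, is added to the frozen weight matrix. A tiny numeric illustration in pure Python, using this run's alpha/r = 32/16 = 2 (the matrices are toy values, not real weights):

```python
def lora_delta(B, A, alpha, r):
    """Compute the LoRA weight update (alpha / r) * (B @ A) for small
    nested-list matrices, without numpy."""
    scale = alpha / r
    return [
        [scale * sum(B[i][k] * A[k][j] for k in range(len(A)))
         for j in range(len(A[0]))]
        for i in range(len(B))
    ]

# Rank-1 update for a 2x2 weight: B is 2x1, A is 1x2
delta = lora_delta(B=[[1.0], [0.0]], A=[[0.0, 2.0]], alpha=32, r=16)
# delta is added to the frozen base weight W at merge time
```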

Framework Versions

  • PEFT: 0.18.1
  • Transformers: Current (as of April 2026)
  • TRL: Current
  • Python: 3.13
  • CUDA: Enabled (RTX 4070 Laptop)
  • bitsandbytes: 4-bit NF4 quantization support