Cybersecurity Assistant – Cybersecurity Reasoning Adapter (Phi-3.5 Mini, LoRA)

Model Summary

Cybersecurity Assistant is a LoRA fine-tuned adapter for microsoft/Phi-3.5-mini-instruct (3.8B parameters), built as a proof of concept for cybersecurity reasoning tasks and as a test of the project's dataset-to-model pipeline. The model is trained to assist with red team tradecraft analysis, blue team detection reasoning, PowerShell/CMD/Bash command interpretation, network analysis (nmap, Wireshark, tcpdump), and Sigma rule evaluation.

The adapter targets a 3B-class model and is trained with QLoRA 4-bit quantization, making it practical to run on consumer-grade hardware with 8GB of VRAM. It is the first completed training run in a sequenced multi-model project that also plans to train Mistral 7B Instruct v0.3, Llama 3.1 8B Instruct, Qwen2.5 7B Instruct, DeepSeek-R1-Distill-Llama-8B, and Mistral NeMo 12B Instruct on the same corpus.


Model Details

Model Description

  • Developed by: Cybersecurity Assistant Project (internal, single-operator lab)
  • Model type: Causal language model; LoRA adapter over Phi-3.5-mini-instruct
  • Language(s): English (technical/cybersecurity domain)
  • License: Adapter weights: internal use only; base model: MIT (microsoft/Phi-3.5-mini-instruct)
  • Finetuned from: microsoft/Phi-3.5-mini-instruct
  • Training framework: PEFT 0.18.1 / TRL / Transformers
  • Quantization: 4-bit NF4 (QLoRA) via bitsandbytes

Intended Uses

Direct Use

Cybersecurity Assistant is designed for use in a controlled cybersecurity lab environment. It is intended to assist a trained security professional with:

  • Interpreting CLI command output across PowerShell (5.1 and 7.x), Windows CMD, and Linux Bash
  • Reasoning about red team tradecraft, including execution sequences, artifact footprints, stealth ratings, and cleanup requirements
  • Mapping adversarial techniques to MITRE ATT&CK framework IDs
  • Analyzing network telemetry from tools such as nmap, tcpdump, and Wireshark
  • Evaluating Sigma detection rules and blue team SOC workflows
  • Distinguishing true positive threat indicators from benign administrative activity (false positive reduction)

Downstream Use

The adapter is designed to be merged into its base model using PEFT merge_and_unload and served via Ollama + Open WebUI for network-accessible inference in a homelab environment. Downstream integration into an agentic scaffold with a human-in-the-loop validation layer is the intended deployment path.
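The merge step described above can be sketched as follows. This is a minimal sketch, not the project's actual script: the paths are placeholders, and imports are deferred inside the function so the snippet can be defined without transformers/peft installed.

```python
def merge_adapter(base_model_id, adapter_path, output_dir):
    """Fold LoRA adapter weights into the base model and save a
    standalone model directory for serving (e.g., conversion for
    Ollama). Imports are deferred so this sketch can be defined
    without the libraries installed."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Load the base model (unquantized) so the merge is clean
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # Attach the LoRA adapter, then merge its weights into the base
    merged = PeftModel.from_pretrained(base, adapter_path).merge_and_unload()
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
    return output_dir
```

After saving, the merged directory can be converted to a format Ollama serves (e.g., GGUF via llama.cpp's conversion tooling) and exposed through Open WebUI.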

Out-of-Scope Use

  • Autonomous offensive operations without human oversight. The model is not designed for unsupervised red team execution. All tool use requires human approval.
  • Production or enterprise deployment without further evaluation. The current adapter was trained on a small dataset (207 records, of which only 11 were correctly formatted at the time of Run 1) and is not yet at deployment-ready benchmark thresholds.
  • Novel zero-day exploitation or techniques outside the training distribution. The model will hallucinate outside its training domain.
  • Replacement of a qualified security analyst. This model is an analyst aid, not a substitute.

Bias, Risks, and Limitations

  • Dataset size: The initial training run used 207 total records, of which only 11 were correctly formatted and ingested. This is far below the project's target threshold (3,000–5,000+ formatted records for meaningful deployment readiness). The current model has thin domain coverage and should be treated as a proof-of-concept.
  • Hallucination risk: Like all LLMs, this model can produce confident but incorrect command syntax, flag definitions, or output interpretations β€” especially for tools, OS versions, or techniques outside the training distribution.
  • Schema coverage gaps: At the time of Run 1, 26 of 207 records were skipped due to unrecognized schemas. Format coverage was 87.4% by record count but only ~5.3% by formatted output (11/207).
  • Benchmark evaluation not yet functional: Run 1 scored 0/10 on the benchmark suite due to a schema mismatch in the benchmark evaluator. Benchmark results from Run 1 do not reflect model capability.
  • Platform scope: Training data focuses on Windows (PowerShell 5.1, PS 7.x, CMD) with supplementary Linux/Bash coverage. Cross-shell parity is incomplete. IPv6 coverage is minimal.
  • Red team content: This model has been trained on offensive security tradecraft. It is intended for authorized lab use by trained professionals only. It should not be distributed or used outside a controlled, authorized environment.

Recommendations

Users must be security professionals operating within authorized lab or engagement boundaries. The model requires a human validation layer for any tool execution suggestions. Do not use for autonomous operations or outside approved scope.
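The required human validation layer can be as simple as a gate that refuses to run any model-suggested command without explicit analyst approval. A minimal sketch (function and parameter names are illustrative, not part of the project's codebase):

```python
def gated_execute(command, approve, run):
    """Run a model-suggested command only if the approval callback
    (a human analyst in practice) explicitly returns True."""
    if approve(command) is True:
        return run(command)
    return None  # rejected or unreviewed suggestions are never executed

# Example: the analyst rejects an out-of-scope scan, so run() is never called
result = gated_execute(
    "nmap -sS 10.0.0.0/8",
    approve=lambda cmd: False,   # stand-in for a human review prompt
    run=lambda cmd: "executed",  # stand-in for actual tool execution
)
```

The key property is that execution is opt-in per command: the default path does nothing, which matches the project's human-in-the-loop deployment requirement.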


How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model_id = "microsoft/Phi-3.5-mini-instruct"
adapter_path = "./adapter"  # path to saved LoRA adapter weights

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="eager",  # flash-attention not required
)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

# Phi-3.5-mini-instruct expects chat-formatted input
messages = [
    {"role": "user", "content": "What does `Get-NetTCPConnection | Where-Object State -eq 'Established'` return, and how should a SOC analyst interpret the output?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Training Details

Training Data

The training dataset is a synthetic cybersecurity corpus in JSONL format, organized into three epoch families:

| Epoch | Folder | Contents | Records |
|-------|--------|----------|---------|
| 1 | 03_toolknowledge + 05_detection_rules | Tool syntax references, Sigma/YARA detection rules, tradecraft matrix, subnetting, tcpdump/Wireshark reasoning | 97 |
| 2 | 04_goldens | Execution traces (success and failure cases) for Linux privesc, Windows initial access, nmap, persistence, network analysis, evasion | 87 |
| 3 | 06_contrast_pairs | Ambiguity and intent reasoning: admin vs. malicious command pairs, Linux/Windows privesc contrast, cross-platform equivalence | 23 |
| **Total** | | | **207** |

Domain coverage at time of Run 1:

| Domain | Records | Target | Coverage |
|--------|---------|--------|----------|
| PowerShell Administration | ~50 | 550 | 9% |
| Red Team & Offensive Tradecraft | ~50 | 220 | 23% |
| Linux / POSIX Operations | ~60 | 150 | 40% |
| Windows Command Line | ~45 | 85 | 53% |
| Network Analysis (nmap, tcpdump, Wireshark) | ~30 | 150 | 20% |
| Cybersecurity Reasoning / Ambiguity | ~60 | 200 | 30% |
| Sigma Detection Rules | 35 | 200 | 18% |
| Blue Team / SOC Workflows | ~40 | 200 | 20% |
| IPv4/IPv6 / Port / Service Identification | ~10 | 150 | 7% |
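The coverage column is simply records divided by target. A quick sanity check of a few rows in plain Python:

```python
# (records, target) pairs taken from the domain coverage table
domains = {
    "PowerShell Administration": (50, 550),
    "Red Team & Offensive Tradecraft": (50, 220),
    "Linux / POSIX Operations": (60, 150),
    "Windows Command Line": (45, 85),
}
for name, (records, target) in domains.items():
    # :.0% formats the ratio as a whole-number percentage
    print(f"{name}: {records / target:.0%}")
```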

Source material includes extracted content from technical references covering PowerShell, Windows CMD, Linux/Bash, network analysis (Nmap, Wireshark, tcpdump), red team field manuals (RTFM), NIST cybersecurity frameworks, and Sigma detection rule libraries.

Benchmark data (07_benchmarks/) was explicitly excluded from training and held as a frozen evaluation set.
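The corpus is JSONL, and Run 1 skipped records whose schema the formatter did not recognize. A loader of that kind typically gates each record on required keys before formatting; a hedged sketch (the required keys here are illustrative assumptions, not the project's actual field names):

```python
import json

REQUIRED_KEYS = {"instruction", "response"}  # assumed schema, for illustration

def load_jsonl(lines):
    """Parse JSONL records, keeping only those matching the expected
    schema. Returns (accepted records, skipped count)."""
    accepted, skipped = [], 0
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1
            continue
        if REQUIRED_KEYS <= rec.keys():  # all required keys present
            accepted.append(rec)
        else:
            skipped += 1  # unrecognized schema, as in Run 1
    return accepted, skipped

recs, skipped = load_jsonl([
    '{"instruction": "explain nmap -sV", "response": "..."}',
    '{"tool": "tcpdump"}',  # missing required keys -> skipped
])
```

Tracking the skipped count explicitly is what surfaces the kind of formatting gap reported for Run 1.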

Training Procedure

Training uses Supervised Fine-Tuning (SFT) via TRL's SFTTrainer with QLoRA (4-bit NF4 quantization).

Curriculum structure:

  • Epoch 1: Tool knowledge + detection rules → syntax and rule foundation
  • Epoch 2: Golden execution traces → execution reality (success and failure)
  • Epoch 3: Contrast pairs → ambiguity and intent reasoning
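Mechanically, the curriculum just concatenates the epoch folders in a fixed order so the trainer sees foundations before ambiguity. A minimal sketch (folder names come from the dataset table; the record values are toy stand-ins):

```python
# Epoch order from the curriculum above
CURRICULUM = [
    ("epoch_1", ["03_toolknowledge", "05_detection_rules"]),
    ("epoch_2", ["04_goldens"]),
    ("epoch_3", ["06_contrast_pairs"]),
]

def build_training_order(records_by_folder):
    """Flatten records into the curriculum's fixed epoch order."""
    ordered = []
    for _epoch, folders in CURRICULUM:
        for folder in folders:
            ordered.extend(records_by_folder.get(folder, []))
    return ordered

# Input order does not matter; output follows the curriculum
order = build_training_order({
    "06_contrast_pairs": ["contrast"],
    "03_toolknowledge": ["syntax"],
    "04_goldens": ["trace"],
})
```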

Training Hyperparameters (Run 1)

| Parameter | Value |
|-----------|-------|
| Training regime | bf16 mixed precision |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Trainable parameters | 8,912,896 / 2,018,053,120 (0.44%) |
| Max sequence length | 1024 tokens |
| Epochs | 3 |
| Total training steps | 69 |
| Formatted / ingested examples | 11 / 207 |
| Warmup | ratio configured (deprecated in v5.2; migrated to warmup_steps) |
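The trainable-parameter ratio in the table follows directly from the two counts. A quick check, with the LoRA settings expressed as a plain dict mirroring the table (this is not the project's actual training script):

```python
lora_config = {  # mirrors the Run 1 hyperparameters above
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
}

# Trainable vs. total parameter counts reported by the trainer
trainable, total = 8_912_896, 2_018_053_120
pct = 100 * trainable / total
print(f"trainable: {pct:.2f}%")
```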

Hardware (Run 1)

| Component | Specification |
|-----------|---------------|
| Machine | Alienware M16 R2 |
| CPU | Intel Core Ultra 7 155H (16 cores / 22 logical) |
| GPU | NVIDIA GeForce RTX 4070 Laptop GPU, 8GB GDDR6 VRAM |
| RAM | 64GB system memory |
| OS | Windows 11 |
| Training time | ~16 minutes (948.5 seconds) |
| Peak VRAM usage | ~4.5GB |
| Peak GPU temp | ~74°C |

Evaluation

Testing Data

Benchmark evaluation uses a frozen evaluation set stored in 07_benchmarks/frozen_eval/. This dataset was never included in training data. Benchmark families cover:

  1. Tool syntax correctness
  2. Execution trace interpretation
  3. Failure diagnosis and recovery
  4. Ambiguity and intent reasoning (false positive discrimination)
  5. Tradecraft and MITRE ATT&CK mapping

Metrics

  • Per-epoch training loss
  • Mean token accuracy
  • Benchmark score (correct / total evaluated questions, across 5 families)
  • Benchmark delta (post-training score minus pre-training baseline)
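The benchmark score and delta metrics reduce to simple arithmetic; a sketch of the scoring logic:

```python
def benchmark_score(results):
    """Fraction of correct answers across all evaluated questions."""
    correct = sum(1 for r in results if r)
    return correct / len(results)

def benchmark_delta(post, pre):
    """Post-training score minus pre-training baseline."""
    return post - pre

# Illustrative values: a 0/10 baseline and a 7/10 post-training run
pre = benchmark_score([False] * 10)
post = benchmark_score([True] * 7 + [False] * 3)
delta = benchmark_delta(post, pre)
```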

Results (Run 1)

| Metric | Value |
|--------|-------|
| Epoch 1 loss | 8.1072 |
| Epoch 2 loss | ~6.5 |
| Epoch 3 loss | 5.2204 |
| Mean train loss | 7.17 |
| Mean token accuracy | 0.3049 |
| Loss reduction | 54.5% (11.49 → 5.22) |
| Benchmark score | 0/10 (evaluator schema mismatch; not reflective of model capability) |

Note: The benchmark evaluator in Run 1 failed to match benchmark records due to a schema format mismatch. Benchmark results from Run 1 are invalid. The benchmark handler was corrected for Run 6 onward.

Multi-Run Summary

| Run | Records | Epochs | Final Loss | Benchmark | Key Finding |
|-----|---------|--------|------------|-----------|-------------|
| Runs 1–3 | 207 | 3 | 5.0–6.2 | 0/10 (evaluator error) | DynamicCache bug; only 11/207 records formatted |
| Run 4 | 502 | 5 | 2.32 | 7/10 (incoherent) | Metadata contamination in V2 records |
| Run 5 | 479 | 5 | 4.18 | 7/10 (partial) | Format oscillation between V1 and V2 style |
| Run 6 | 479 | 5 | TBD | TBD | Format unified; benchmark handlers added |
Project deployment threshold: ≥85% on the Failure Diagnosis AND Ambiguity/Intent Reasoning benchmarks across a dataset of 3,000+ formatted records.

Current realistic competence levels by dataset size:

| Formatted records | Expected benchmark | Agent readiness |
|-------------------|--------------------|-----------------|
| ~11 (Run 1) | ~20–35% | Not deployable |
| 500 | ~45–60% | Experimental only |
| 1,500 | ~65–75% | Limited deployment |
| 3,000+ | ~80–90% | Approaching target |
| 5,000+ | 85%+ sustained | Project threshold met |

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: NVIDIA GeForce RTX 4070 Laptop GPU (8GB GDDR6)
  • Hours used: ~0.27 hours per training run (Run 1: 948.5 seconds)
  • Cloud Provider: N/A (local training on Alienware M16 R2)
  • Compute Region: N/A (on-premises)
  • Carbon Emitted: Minimal (single laptop GPU, ~16-minute runs)

Technical Specifications

Model Architecture and Objective

  • Base architecture: Phi-3.5-mini-instruct (3.8B parameters, dense decoder-only transformer)
  • Adaptation method: Low-Rank Adaptation (LoRA) via PEFT
  • Quantization: 4-bit NF4 (QLoRA); base model weights frozen and quantized, LoRA adapters trained in bf16
  • Objective: Supervised fine-tuning (SFT) on structured cybersecurity reasoning records
  • Attention: Standard attention (flash-attention not installed; attn_implementation='eager' used)
  • Context window: 1024 tokens (max_seq_length at training time)

Compute Infrastructure

  • Training hardware: Alienware M16 R2 (RTX 4070 Laptop GPU 8GB, Intel Core Ultra 7 155H, 64GB RAM)
  • Training software: Python 3.13, PyTorch (CUDA), Transformers, PEFT 0.18.1, TRL, bitsandbytes
  • Logging: Custom PowerShell dashboard (dashboard.ps1), per-step JSONL loss logs, hardware telemetry via pynvml + psutil

Glossary

  • LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning method that trains small rank-decomposition matrices alongside frozen base model weights. Enables fine-tuning of large models on consumer hardware.
  • QLoRA: LoRA applied to a 4-bit quantized base model. Reduces VRAM requirements significantly while preserving most fine-tuning quality.
  • Golden records: Training records derived from lab-validated command execution: deterministic, verified, and clean. The most important record type for tool-correctness training.
  • Contrast pairs: Training records presenting ambiguous or similar-looking commands with different intents, used to sharpen the model's decision boundary between malicious and benign activity.
  • Epoch curriculum: The deliberate ordering of training data families across epochs (tool knowledge → execution traces → ambiguity reasoning) to build domain competence progressively.
  • MITRE ATT&CK: A publicly available knowledge base of adversary tactics, techniques, and procedures used to label red team training records.
  • Sigma rules: Generic signatures for SIEM systems used to detect adversarial behavior. A key domain in the blue team training corpus.
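The LoRA entry above can be made concrete: the adapter learns low-rank matrices B and A whose product, scaled by alpha/r, is added to the frozen weight matrix. A tiny numeric illustration in pure Python, using this run's alpha/r = 32/16 = 2 (the matrices are toy values, not real weights):

```python
def lora_delta(B, A, alpha, r):
    """Compute the LoRA weight update (alpha / r) * (B @ A) for small
    nested-list matrices, without numpy."""
    scale = alpha / r
    return [
        [scale * sum(B[i][k] * A[k][j] for k in range(len(A)))
         for j in range(len(A[0]))]
        for i in range(len(B))
    ]

# Rank-1 update for a 2x2 weight: B is 2x1, A is 1x2
delta = lora_delta(B=[[1.0], [0.0]], A=[[0.0, 2.0]], alpha=32, r=16)
# delta is added to the frozen base weight W at merge time
```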

Framework Versions

  • PEFT: 0.18.1
  • Transformers: Current (as of April 2026)
  • TRL: Current
  • Python: 3.13
  • CUDA: Enabled (RTX 4070 Laptop)
  • bitsandbytes: 4-bit NF4 quantization support