KestrelNet — 1,059-Parameter Fraud Classifier

A fully-connected neural network for real-time transaction fraud detection. Built from scratch with pure NumPy: no PyTorch, no TensorFlow, no ONNX runtime. The entire model is 8.3 KB.

Why This Exists

Most fraud detection models are overbuilt. We wanted to find the floor: what's the smallest model that still works? It turns out that 1,059 parameters gets you to 91.6% accuracy with roughly 5 μs inference on commodity CPU hardware.

Performance

Metric	Value
Accuracy	91.6%
Parameters	1,059
Model size	8.3 KB
Inference latency	~5 μs (CPU)
Throughput	~190,000 inferences/sec
Dependencies	NumPy only

For context, a single GPT-2 attention head has more parameters than this entire model.

Architecture

Input (14 features) → Dense(32, ReLU) → Dense(16, ReLU) → Dense(3, Softmax)

Three layers. No batch norm, no attention, no residual connections. Just matrix multiplies and ReLU.
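As a rough illustration of the architecture line above, the forward pass can be sketched in a few lines of NumPy. This is not the shipped implementation: the names `LAYER_SIZES`, `init_params`, and `forward` are ours, and the He-style random initialization and float64 storage are assumptions.

```python
import numpy as np

LAYER_SIZES = [14, 32, 16, 3]  # input, two hidden layers, output

def init_params(rng):
    """Random weights and zero biases per Dense layer (illustrative init only)."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])
    ]

def forward(params, x):
    """ReLU on the hidden layers, softmax on the output layer."""
    h = np.asarray(x, dtype=np.float64)
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = params[-1]
    logits = h @ W + b
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))  # numerically stable softmax
    return exp / exp.sum(axis=-1, keepdims=True)

params = init_params(np.random.default_rng(0))
n_params = sum(W.size + b.size for W, b in params)
size_bytes = sum(W.nbytes + b.nbytes for W, b in params)
probs = forward(params, np.zeros(14))
print(n_params, size_bytes)  # 1059 8472
```

The count works out as (14·32 + 32) + (32·16 + 16) + (16·3 + 3) = 1,059. At 8 bytes per float64 value that is 8,472 bytes, which is consistent with the 8.3 KB figure if the weights are stored as float64.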

Training uses analytic backpropagation — full gradient computation without autograd. Every partial derivative is derived by hand and implemented directly. This makes the training loop ~10x faster than equivalent PyTorch code for models this size.
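To make "derived by hand" concrete, here is a minimal sketch of analytic backpropagation for a ReLU network with a softmax output. It assumes a mean cross-entropy loss over integer class labels, which this card does not state, and the function names are illustrative rather than the project's own.

```python
import numpy as np

def forward(params, x):
    """Return softmax probabilities and the cached input of every layer."""
    acts = [x]
    for W, b in params[:-1]:
        acts.append(np.maximum(acts[-1] @ W + b, 0.0))
    W, b = params[-1]
    z = acts[-1] @ W + b
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True), acts

def loss(params, x, y):
    """Mean cross-entropy for integer labels y (assumed loss)."""
    p, _ = forward(params, x)
    return -np.log(p[np.arange(len(y)), y]).mean()

def backward(params, x, y):
    """Hand-derived gradients of the mean cross-entropy for every (W, b)."""
    p, acts = forward(params, x)
    delta = p.copy()
    delta[np.arange(len(y)), y] -= 1.0   # dL/dlogits = softmax - one_hot
    delta /= len(y)
    grads = []
    for i in range(len(params) - 1, -1, -1):
        W, _ = params[i]
        grads.append((acts[i].T @ delta, delta.sum(axis=0)))
        if i > 0:
            # ReLU derivative: pass gradient only where the activation was positive
            delta = (delta @ W.T) * (acts[i] > 0)
    return grads[::-1]

rng = np.random.default_rng(0)
sizes = [14, 32, 16, 3]
params = [(rng.normal(0, 0.3, (a, b)), rng.normal(0, 0.1, b))
          for a, b in zip(sizes[:-1], sizes[1:])]
x, y = rng.normal(size=(8, 14)), rng.integers(0, 3, size=8)
grads = backward(params, x, y)

# One plain SGD step, built as new arrays so the originals stay untouched
lr = 0.05
new_params = [(W - lr * gW, b - lr * gb) for (W, b), (gW, gb) in zip(params, grads)]
```

A finite-difference check against `loss` is a cheap way to validate gradients like these before trusting a training run.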

GullNet Variant

We also offer a GullNet variant that replaces standard dot products with multivector products, giving the network native access to rotations, reflections, and scaling in a single operation — useful when feature interactions have geometric structure. The GullNet variant has more parameters but can capture complex feature relationships that FC nets miss.
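This card does not say which algebra the variant uses, so the following is only an illustration of what a multivector product looks like, using the two-dimensional geometric algebra Cl(2,0) as an assumed example. Multivectors are stored as `[scalar, e1, e2, e12]`; `geometric_product` and `rotate` are hypothetical helper names, not part of any published API.

```python
import numpy as np

def geometric_product(a, b):
    """Product of two Cl(2,0) multivectors with components [scalar, e1, e2, e12]."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,  # scalar part
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,  # e1 part
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,  # e2 part
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,  # e12 (bivector) part
    ])

def rotate(v, theta):
    """Rotate the vector part of v counterclockwise by theta via the sandwich R v R~."""
    rotor = np.array([np.cos(theta / 2), 0.0, 0.0, -np.sin(theta / 2)])
    reverse = rotor * np.array([1.0, 1.0, 1.0, -1.0])  # reversing flips the bivector sign
    return geometric_product(geometric_product(rotor, v), reverse)

e1 = np.array([0.0, 1.0, 0.0, 0.0])
e2 = np.array([0.0, 0.0, 1.0, 0.0])
rotated = rotate(e1, np.pi / 2)  # approximately e2
```

A single product mixes scalar, vector, and bivector parts, which is how one operation can express scaling and rotation together.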

Input Features

The model expects a 14-dimensional normalized feature vector:

Index	Feature	Normalization
0	`amount_vs_avg`	Transaction amount / 90-day average
1-2	`hour_sin`, `hour_cos`	Cyclical encoding of transaction hour
3-4	`day_sin`, `day_cos`	Cyclical encoding of day of week
5	`location_delta`	Std deviations from usual location
6	`velocity_1h`	Transactions in past hour / 10, clipped
7	`velocity_24h`	Transactions in past 24h / 30, clipped
8	`merchant_risk`	Merchant category risk score [0-1]
9	`international`	Cross-border transaction (0/1)
10	`card_present`	Physical card used (0/1)
11	`device_match`	Known device (0/1)
12	`account_age_norm`	Account age / 3650 days
13	`prev_fraud_score`	Historical fraud rate [0-1]
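The table can be turned into a small feature builder. The sketch below is an assumption-laden illustration: `build_features` and its argument names are hypothetical, the cyclical encodings use periods of 24 hours and 7 days, and "clipped" is taken to mean clipped to a maximum of 1.0. Check these against your own preprocessing before relying on them.

```python
import numpy as np

def build_features(amount, avg_amount_90d, hour, day_of_week, location_delta,
                   tx_past_hour, tx_past_24h, merchant_risk, international,
                   card_present, device_match, account_age_days, prev_fraud_score):
    """Assemble the 14-dimensional vector in the index order of the table."""
    hour_angle = 2 * np.pi * hour / 24        # assumed 24-hour period
    day_angle = 2 * np.pi * day_of_week / 7   # assumed 7-day period
    return np.array([
        amount / avg_amount_90d,                  # 0  amount_vs_avg
        np.sin(hour_angle), np.cos(hour_angle),   # 1-2  hour_sin, hour_cos
        np.sin(day_angle), np.cos(day_angle),     # 3-4  day_sin, day_cos
        location_delta,                           # 5  location_delta
        min(tx_past_hour / 10, 1.0),              # 6  velocity_1h (assumed clip at 1.0)
        min(tx_past_24h / 30, 1.0),               # 7  velocity_24h (assumed clip at 1.0)
        merchant_risk,                            # 8  merchant_risk
        float(international),                     # 9  international
        float(card_present),                      # 10 card_present
        float(device_match),                      # 11 device_match
        account_age_days / 3650,                  # 12 account_age_norm
        prev_fraud_score,                         # 13 prev_fraud_score
    ], dtype=np.float64)

x = build_features(amount=120.0, avg_amount_90d=100.0, hour=14, day_of_week=2,
                   location_delta=0.1, tx_past_hour=1, tx_past_24h=3,
                   merchant_risk=0.05, international=False, card_present=True,
                   device_match=True, account_age_days=365, prev_fraud_score=0.0)
```

Encoding hour and day as sine/cosine pairs keeps 23:00 and 00:00 close together, which a raw hour value would not.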

Output

Three-class softmax: [legitimate, review, fraudulent]

Threshold modes control the decision boundary:

Standard — Balanced precision/recall
Conservative — Flags more transactions (fewer false negatives)
Strict — Flags fewer (fewer false positives)
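One way to picture the three modes is as different cutoffs applied to the softmax output. The cutoff values and the `decide` function below are hypothetical, chosen only to show the direction of each mode; the actual thresholds are not given in this card.

```python
# Hypothetical cutoffs: lower values flag more transactions.
THRESHOLDS = {
    "standard":     {"fraud": 0.50, "review": 0.50},
    "conservative": {"fraud": 0.30, "review": 0.30},
    "strict":       {"fraud": 0.80, "review": 0.70},
}

def decide(probs, mode="standard"):
    """Map [legitimate, review, fraudulent] probabilities to a label."""
    _, review, fraud = probs
    cut = THRESHOLDS[mode]
    if fraud >= cut["fraud"]:
        return "fraudulent"
    if review + fraud >= cut["review"]:
        return "review"
    return "legitimate"

probs = [0.55, 0.10, 0.35]
decisions = {mode: decide(probs, mode) for mode in THRESHOLDS}
# {'standard': 'legitimate', 'conservative': 'fraudulent', 'strict': 'legitimate'}
```

With these example cutoffs, a borderline transaction is flagged only in Conservative mode, which matches the "flags more transactions" description.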

Benchmarks — Public Datasets

KestrelNet and GoshawkNet were evaluated on public Kaggle datasets; the fraud row uses proprietary data. Results on the public datasets are independently reproducible.

Dataset	Task	Accuracy	F1 / AUC	Params	Latency	Kaggle source
ECG Heartbeat (MIT-BIH)	5-class arrhythmia	97.2%	F1 0.853	12,756	56 μs	shayanfazeli/heartbeat
EEG Emotions	3-class sentiment	99.1%	F1 0.991	163,788	1.3 ms	birdy654/eeg-brainwave-dataset-feeling-emotions
EEG Eye State	Binary open/closed	94.2%	AUC 0.986	1,576	17 μs	robikscube/eye-state-classification-eeg-dataset
Seizure Prediction (Bonn)	Binary seizure	97.1%	AUC 0.988	12,072	—	harunshimanto/epileptic-seizure-recognition
HAR Smartphones (UCI)	6-class activity	94.9%	F1 0.949	15,416	70 μs	uciml/human-activity-recognition-with-smartphones
Fraud Detection	3-class fraud	91.6%	—	1,059	5 μs	Proprietary

All benchmarks run on CPU. No GPU required. Pure NumPy inference.
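For readers who want to reproduce latency numbers on their own hardware, a timing harness along these lines is one option. It is a sketch, not the script used for the table: the weights are random stand-ins, and `predict` and `time_inference` are names we made up for the example.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
sizes = [14, 32, 16, 3]
# Stand-in random weights; a real benchmark would load the trained parameters.
weights = [(rng.normal(size=(a, b)), np.zeros(b)) for a, b in zip(sizes[:-1], sizes[1:])]

def predict(x):
    """Single-sample forward pass: ReLU hidden layers, softmax output."""
    h = x
    for W, b in weights[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = weights[-1]
    z = h @ W + b
    e = np.exp(z - z.max())
    return e / e.sum()

def time_inference(fn, x, n_runs=2_000, warmup=200):
    """Return (mean latency in microseconds, inferences per second)."""
    for _ in range(warmup):          # warm caches before timing
        fn(x)
    start = time.perf_counter()
    for _ in range(n_runs):
        fn(x)
    elapsed = time.perf_counter() - start
    latency_us = elapsed / n_runs * 1e6
    return latency_us, 1e6 / latency_us

latency_us, throughput = time_inference(predict, rng.normal(size=14))
print(f"{latency_us:.1f} us per inference, {throughput:,.0f} inferences/sec")
```

Results depend heavily on CPU, NumPy build, and Python call overhead, so expect your numbers to differ from the table.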

Parameter Efficiency

For comparison, typical models on these datasets:

Dataset	Typical CNN/LSTM	KestrelNet/GoshawkNet	Reduction
ECG Heartbeat	500K–2M params	12,756	40–160x smaller
EEG Emotions	1M+ params	163,788	6x smaller
EEG Eye State	100K+ params	1,576	63x smaller
HAR Smartphones	200K–1M params	15,416	13–65x smaller
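The reduction column is simply the typical parameter count divided by ours. The short check below recomputes it from the figures in the table; for the "1M+" and "100K+" rows only the lower bound is used, and the dictionary names are ours.

```python
ours = {
    "ECG Heartbeat": 12_756,
    "EEG Emotions": 163_788,
    "EEG Eye State": 1_576,
    "HAR Smartphones": 15_416,
}
# (low, high) typical parameter counts; open-ended rows repeat the lower bound
typical = {
    "ECG Heartbeat": (500_000, 2_000_000),
    "EEG Emotions": (1_000_000, 1_000_000),
    "EEG Eye State": (100_000, 100_000),
    "HAR Smartphones": (200_000, 1_000_000),
}

reduction = {
    name: tuple(count / ours[name] for count in typical[name])
    for name in ours
}
for name, (low, high) in reduction.items():
    print(f"{name}: {low:.1f}x to {high:.1f}x")
```

The ECG ratios come out at about 39.2x and 156.8x, which the table rounds to 40–160x.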

Quick Start

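import numpy as np
from kestrelnet import KestrelNet

model = KestrelNet.from_pretrained("kestrelnet/fraud-classifier")
# 14 normalized features, in the index order of the Input Features table
h, d = 2 * np.pi * 14 / 24, 2 * np.pi * 2 / 7  # hour 14, day 2; assumes period-24 and period-7 sin/cos encoding
scores = model.predict([
    1.2, np.sin(h), np.cos(h), np.sin(d), np.cos(d),  # amount_vs_avg, hour and day encodings
    0.1, 1 / 10, 3 / 30, 0.05, 0, 1, 1,               # location_delta through device_match
    365 / 3650, 0.0,                                  # account_age_norm, prev_fraud_score
])
# {'legitimate': 0.983, 'review': 0.017, 'fraudulent': 0.000}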

Intended Use

Real-time fraud screening for payment processors
Pre-filter before heavier ML models (ensemble first stage)
Edge deployment where GPU is unavailable
Educational reference for from-scratch neural networks

Limitations

Trained on synthetic/proprietary data — accuracy on your distribution will vary
14 fixed features — cannot ingest raw transaction logs directly
No sequence modeling — treats each transaction independently
Small capacity means it cannot memorize complex fraud patterns

How to Cite

@misc{kestrelnet2026,
  title={KestrelNet: Sub-Kilobyte Neural Fraud Classifier},
  author={KestrelNet Team},
  year={2026},
  url={https://huggingface.co/kestrelnet/fraud-classifier}
}
