KestrelNet: A 1,059-Parameter Fraud Classifier
A fully connected neural network for real-time transaction fraud detection. Built from scratch with pure NumPy: no PyTorch, no TensorFlow, no ONNX runtime. The entire model weighs in at 8.3 KB.
Why This Exists
Most fraud detection models are overbuilt. We wanted to find the floor: what's the smallest model that still works? It turns out 1,059 parameters get you to 91.6% accuracy with single-digit-microsecond inference on commodity hardware.
Performance
| Metric | Value |
|---|---|
| Accuracy | 91.6% |
| Parameters | 1,059 |
| Model size | 8.3 KB |
| Inference latency | ~5 μs (CPU) |
| Throughput | ~190,000 inferences/sec |
| Dependencies | NumPy only |
For context, a single GPT-2 attention head has more parameters than this entire model.
Architecture
Input (14 features) → Dense(32, ReLU) → Dense(16, ReLU) → Dense(3, Softmax)
Three layers. No batch norm, no attention, no residual connections. Just matrix multiplies and ReLU.
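The stack above can be sketched directly in NumPy. The weights below are random placeholders (the released model ships trained values), but the shapes and the 1,059-parameter count match the stated architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer shapes matching the stated architecture: 14 -> 32 -> 16 -> 3.
# Random placeholder weights; the real model ships trained values.
W1, b1 = rng.standard_normal((14, 32)) * 0.1, np.zeros(32)
W2, b2 = rng.standard_normal((32, 16)) * 0.1, np.zeros(16)
W3, b3 = rng.standard_normal((16, 3)) * 0.1, np.zeros(3)

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x):
    h1 = relu(x @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    return softmax(h2 @ W3 + b3)

# Parameter count: (14*32+32) + (32*16+16) + (16*3+3) = 480 + 528 + 51 = 1,059
n_params = sum(w.size + b.size for w, b in [(W1, b1), (W2, b2), (W3, b3)])
assert n_params == 1059

probs = forward(rng.standard_normal(14))
assert probs.shape == (3,) and np.isclose(probs.sum(), 1.0)
```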
Training uses analytic backpropagation: full gradient computation without autograd. Every partial derivative is derived by hand and implemented directly. This makes the training loop ~10x faster than equivalent PyTorch code for models this size.
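As an illustration of hand-derived gradients, here is the analytic softmax cross-entropy gradient for a single dense layer, checked against a finite-difference estimate. This is a toy sketch of the technique, not the actual training loop:

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny dense layer + softmax cross-entropy, with the gradient written out by hand.
x = rng.standard_normal(5)
W = rng.standard_normal((5, 3)) * 0.1
b = np.zeros(3)
y = 1  # true class index

def loss(W):
    z = x @ W + b
    p = np.exp(z - z.max())
    p /= p.sum()
    return -np.log(p[y])

# Analytic gradient: dL/dz = p - onehot(y), then dL/dW = outer(x, dL/dz)
z = x @ W + b
p = np.exp(z - z.max())
p /= p.sum()
dz = p.copy()
dz[y] -= 1.0
dW = np.outer(x, dz)

# Finite-difference check on one weight entry
eps = 1e-6
Wp = W.copy()
Wp[2, 1] += eps
num = (loss(Wp) - loss(W)) / eps
assert abs(num - dW[2, 1]) < 1e-4
```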
GullNet Variant
We also offer a GullNet variant that replaces standard dot products with multivector products, giving the network native access to rotations, reflections, and scaling in a single operation. This is useful when feature interactions have geometric structure. The GullNet variant has more parameters but can capture complex feature relationships that fully connected nets miss.
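The card does not specify which algebra GullNet uses, but the idea can be illustrated with the simplest multivector product: the 2D rotor product (isomorphic to complex multiplication), where a single product performs rotation and scaling at once:

```python
import numpy as np

# Treat a pair (a, b) as the 2D even multivector a + b*e12.
# Multiplying by (cos t, sin t) rotates by t; multiplying by (s, 0) scales by s.
# One operation covers both, which is the property the GullNet description
# points at. (Sketch only: the actual GullNet algebra is not specified here.)
def gprod(u, v):
    a, b = u
    c, d = v
    return np.array([a * c - b * d, a * d + b * c])

v = np.array([1.0, 0.0])
t = np.pi / 2
rotated = gprod(v, np.array([np.cos(t), np.sin(t)]))  # 90-degree rotation
scaled = gprod(v, np.array([3.0, 0.0]))               # scale by 3
assert np.allclose(rotated, [0.0, 1.0])
assert np.allclose(scaled, [3.0, 0.0])
```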
Input Features
The model expects a 14-dimensional normalized feature vector:
| Index | Feature | Normalization |
|---|---|---|
| 0 | amount_vs_avg | Transaction amount / 90-day average |
| 1-2 | hour_sin, hour_cos | Cyclical encoding of transaction hour |
| 3-4 | day_sin, day_cos | Cyclical encoding of day of week |
| 5 | location_delta | Std deviations from usual location |
| 6 | velocity_1h | Transactions in past hour / 10, clipped |
| 7 | velocity_24h | Transactions in past 24h / 30, clipped |
| 8 | merchant_risk | Merchant category risk score [0-1] |
| 9 | international | Cross-border transaction (0/1) |
| 10 | card_present | Physical card used (0/1) |
| 11 | device_match | Known device (0/1) |
| 12 | account_age_norm | Account age / 3650 days |
| 13 | prev_fraud_score | Historical fraud rate [0-1] |
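The cyclical encodings in rows 1-2 and 3-4 can be computed as follows. The helper name `cyclical` is ours, not part of the package:

```python
import numpy as np

# Cyclical encoding: map hour-of-day and day-of-week onto the unit circle,
# so that 23:00 and 00:00 end up adjacent instead of 23 units apart.
def cyclical(value, period):
    angle = 2 * np.pi * value / period
    return np.sin(angle), np.cos(angle)

hour_sin, hour_cos = cyclical(14, 24)  # 2 PM
day_sin, day_cos = cyclical(2, 7)      # day index 2 of 7

# 23:00 and 00:00 are close in the encoded space:
a = np.array(cyclical(23, 24))
b = np.array(cyclical(0, 24))
assert np.linalg.norm(a - b) < 0.3
```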
Output
Three-class softmax: [legitimate, review, fraudulent]
Threshold modes control the decision boundary:
- Standard: balanced precision/recall
- Conservative: flags more transactions (fewer false negatives)
- Strict: flags fewer transactions (fewer false positives)
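A decision rule over the softmax output might look like the sketch below. The card names the three modes but not their cutoffs, so the threshold values and the `decide` helper are illustrative placeholders:

```python
import numpy as np

# Hypothetical cutoffs -- the model card does not publish the real ones.
THRESHOLDS = {"standard": 0.50, "conservative": 0.30, "strict": 0.70}
CLASSES = ["legitimate", "review", "fraudulent"]

def decide(probs, mode="standard"):
    t = THRESHOLDS[mode]
    flagged = probs[1] + probs[2]  # probability mass on review + fraudulent
    if flagged >= t:
        # Flag as whichever non-legitimate class is more likely
        return CLASSES[1] if probs[1] >= probs[2] else CLASSES[2]
    return CLASSES[0]

probs = np.array([0.55, 0.35, 0.10])
assert decide(probs, "conservative") == "review"   # 0.45 >= 0.30, flagged
assert decide(probs, "strict") == "legitimate"     # 0.45 < 0.70, passed
```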
Benchmarks on Public Datasets
KestrelNet and GoshawkNet were evaluated on public Kaggle datasets. All results are independently reproducible.
| Dataset | Task | Accuracy | F1 / AUC | Params | Latency | Source |
|---|---|---|---|---|---|---|
| ECG Heartbeat (MIT-BIH) | 5-class arrhythmia | 97.2% | F1 0.853 | 12,756 | 56 μs | shayanfazeli/heartbeat |
| EEG Emotions | 3-class sentiment | 99.1% | F1 0.991 | 163,788 | 1.3 ms | birdy654/eeg-brainwave-dataset-feeling-emotions |
| EEG Eye State | Binary open/closed | 94.2% | AUC 0.986 | 1,576 | 17 μs | robikscube/eye-state-classification-eeg-dataset |
| Seizure Prediction (Bonn) | Binary seizure | 97.1% | AUC 0.988 | 12,072 | n/a | harunshimanto/epileptic-seizure-recognition |
| HAR Smartphones (UCI) | 6-class activity | 94.9% | F1 0.949 | 15,416 | 70 μs | uciml/human-activity-recognition-with-smartphones |
| Fraud Detection | 3-class fraud | 91.6% | n/a | 1,059 | 5 μs | Proprietary |
All benchmarks run on CPU. No GPU required. Pure NumPy inference.
Parameter Efficiency
For comparison, typical models on these datasets:
| Dataset | Typical CNN/LSTM | KestrelNet/GoshawkNet | Reduction |
|---|---|---|---|
| ECG Heartbeat | 500K-2M params | 12,756 | 40-160x smaller |
| EEG Emotions | 1M+ params | 163,788 | 6x smaller |
| EEG Eye State | 100K+ params | 1,576 | 63x smaller |
| HAR Smartphones | 200K-1M params | 15,416 | 13-65x smaller |
Quick Start
```python
import numpy as np
from kestrelnet import KestrelNet

model = KestrelNet.from_pretrained("kestrelnet/fraud-classifier")

# 14 normalized features, in the order listed in the feature table above
x = np.array([
    1.2,            # amount_vs_avg
    -0.50, -0.87,   # hour_sin, hour_cos (hour 14)
    0.97, -0.22,    # day_sin, day_cos (day 2)
    0.1,            # location_delta
    0.1, 0.1,       # velocity_1h, velocity_24h
    0.05,           # merchant_risk
    0.0, 1.0, 1.0,  # international, card_present, device_match
    0.1,            # account_age_norm (365 / 3650 days)
    0.0,            # prev_fraud_score
])
scores = model.predict(x)
# {'legitimate': 0.983, 'review': 0.017, 'fraudulent': 0.000}
```
Intended Use
- Real-time fraud screening for payment processors
- Pre-filter before heavier ML models (ensemble first stage)
- Edge deployment where GPU is unavailable
- Educational reference for from-scratch neural networks
Limitations
- Trained on synthetic/proprietary data; accuracy on your distribution will vary
- 14 fixed features; cannot ingest raw transaction logs directly
- No sequence modeling; treats each transaction independently
- Small capacity means it cannot memorize complex fraud patterns
How to Cite
```bibtex
@misc{kestrelnet2026,
  title={KestrelNet: Sub-Kilobyte Neural Fraud Classifier},
  author={KestrelNet Team},
  year={2026},
  url={https://huggingface.co/kestrelnet/fraud-classifier}
}
```