KestrelNet: 1,059-Parameter Fraud Classifier

A fully-connected neural network for real-time transaction fraud detection. Built from scratch in pure NumPy: no PyTorch, no TensorFlow, no ONNX runtime. The entire model is about 8.3 KB.

Why This Exists

Most fraud detection models are overbuilt. We wanted to find the floor: what's the smallest model that still works? It turns out that 1,059 parameters gets you to 91.6% accuracy with inference in about 5 μs on commodity CPU hardware.

Performance

| Metric | Value |
|---|---|
| Accuracy | 91.6% |
| Parameters | 1,059 |
| Model size | 8.3 KB |
| Inference latency | ~5 μs (CPU) |
| Throughput | ~190,000 inferences/sec |
| Dependencies | NumPy only |

For context, a single GPT-2 attention head has more parameters than this entire model.

Architecture

Input (14 features) → Dense(32, ReLU) → Dense(16, ReLU) → Dense(3, Softmax)

Three layers. No batch norm, no attention, no residual connections. Just matrix multiplies and ReLU.
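As a rough illustration of the architecture line above, here is a minimal NumPy forward pass. The layer sizes come from this card; the weight initialization, the function names, and the float64 storage are assumptions made for the sketch, not the released implementation.

```python
import numpy as np

LAYER_SIZES = [14, 32, 16, 3]  # input, hidden, hidden, output (from the architecture line)

def init_params(rng):
    """Random weights and zero biases per dense layer (illustrative init, not the trained weights)."""
    params = []
    for fan_in, fan_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params

def forward(params, x):
    """Return class probabilities for a batch x of shape (n, 14)."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)                    # Dense + ReLU
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)               # softmax

params = init_params(np.random.default_rng(0))
n_params = sum(W.size + b.size for W, b in params)
probs = forward(params, np.zeros((2, 14)))

print(n_params)      # 1059 = (14*32+32) + (32*16+16) + (16*3+3)
print(n_params * 8)  # 8472 bytes, assuming float64 storage
```

The count confirms the 1,059 figure, and 8,472 bytes is consistent with the stated 8.3 KB if the weights are stored as float64.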

Training uses analytic backpropagation: full gradient computation without autograd. Every partial derivative is derived by hand and implemented directly. This makes the training loop ~10x faster than equivalent PyTorch code for models this size.
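To make "derived by hand" concrete, the sketch below writes out the gradients for a softmax output with mean cross-entropy loss and ReLU hidden layers, then checks one entry against a finite difference. The loss choice, the function names, and the random data are assumptions for illustration; the card does not specify the actual training code.

```python
import numpy as np

def forward(params, x):
    """Forward pass that also returns the layer inputs needed for backprop."""
    acts = [x]
    for W, b in params[:-1]:
        acts.append(np.maximum(acts[-1] @ W + b, 0.0))
    W, b = params[-1]
    z = acts[-1] @ W + b
    z = z - z.max(axis=1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    return p, acts

def loss_fn(params, x, y):
    """Mean cross-entropy (assumed loss)."""
    p, _ = forward(params, x)
    return -np.log(p[np.arange(len(y)), y]).mean()

def backward(params, x, y):
    """Hand-derived gradients of the mean cross-entropy loss, one (dW, db) pair per layer."""
    p, acts = forward(params, x)
    n = len(y)
    delta = p.copy()
    delta[np.arange(n), y] -= 1.0        # dL/dlogits for softmax + cross-entropy
    delta /= n
    grads = []
    for i in range(len(params) - 1, -1, -1):
        W, _ = params[i]
        grads.append((acts[i].T @ delta, delta.sum(axis=0)))
        if i > 0:
            # Back through W, then through the ReLU that produced acts[i]
            delta = (delta @ W.T) * (acts[i] > 0)
    return grads[::-1]

rng = np.random.default_rng(0)
sizes = [14, 32, 16, 3]
params = [(rng.normal(0, 0.3, (a, b)), rng.normal(0, 0.1, b))
          for a, b in zip(sizes[:-1], sizes[1:])]
x = rng.normal(size=(8, 14))
y = rng.integers(0, 3, size=8)

grads = backward(params, x, y)

# Central-difference check on a single first-layer weight
eps = 1e-6
W0 = params[0][0]
W0[3, 5] += eps
loss_plus = loss_fn(params, x, y)
W0[3, 5] -= 2 * eps
loss_minus = loss_fn(params, x, y)
W0[3, 5] += eps                          # restore the original value
numeric = (loss_plus - loss_minus) / (2 * eps)
analytic = grads[0][0][3, 5]
```

A gradient check like this is a common way to validate hand-written derivatives before trusting a training loop built on them.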

GullNet Variant

We also offer a GullNet variant that replaces standard dot products with multivector products, giving the network native access to rotations, reflections, and scaling in a single operation. This is useful when feature interactions have geometric structure. The GullNet variant has more parameters but can capture complex feature relationships that fully-connected networks miss.
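For readers unfamiliar with multivector products, the following sketch implements the geometric product in the smallest non-trivial case, the 2D algebra Cl(2,0), and uses it to rotate a vector. The choice of algebra, the component layout, and the function names are assumptions for illustration only; the card does not say which algebra or layout GullNet uses.

```python
import numpy as np

def geometric_product(a, b):
    """Geometric product in Cl(2,0); components are ordered [1, e1, e2, e12]."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return np.array([
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,   # scalar part
        a0 * b1 + a1 * b0 - a2 * b3 + a3 * b2,   # e1
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,   # e2
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,   # e12 (bivector)
    ])

def rotate(v, theta):
    """Rotate the vector part of v by theta using the sandwich product R v R~."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    R = np.array([c, 0.0, 0.0, -s])       # rotor
    R_rev = np.array([c, 0.0, 0.0, s])    # its reverse
    return geometric_product(geometric_product(R, v), R_rev)

e1 = np.array([0.0, 1.0, 0.0, 0.0])
e2 = np.array([0.0, 0.0, 1.0, 0.0])
e12 = geometric_product(e1, e2)           # the unit bivector
rotated = rotate(e1, np.pi / 2)           # e1 rotated by 90 degrees lands on e2
```

The same product covers scaling (multiply by a scalar multivector) and reflection (sandwich with a unit vector and negate), which is the sense in which one operation gives access to several geometric transforms.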

Input Features

The model expects a 14-dimensional normalized feature vector:

| Index | Feature | Normalization |
|---|---|---|
| 0 | amount_vs_avg | Transaction amount / 90-day average |
| 1-2 | hour_sin, hour_cos | Cyclical encoding of transaction hour |
| 3-4 | day_sin, day_cos | Cyclical encoding of day of week |
| 5 | location_delta | Std deviations from usual location |
| 6 | velocity_1h | Transactions in past hour / 10, clipped |
| 7 | velocity_24h | Transactions in past 24h / 30, clipped |
| 8 | merchant_risk | Merchant category risk score [0-1] |
| 9 | international | Cross-border transaction (0/1) |
| 10 | card_present | Physical card used (0/1) |
| 11 | device_match | Known device (0/1) |
| 12 | account_age_norm | Account age / 3650 days |
| 13 | prev_fraud_score | Historical fraud rate [0-1] |
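The feature table can be turned into an encoder along these lines. This is a sketch under stated assumptions: `encode_features` and its argument names are hypothetical, the cyclical encodings use periods of 24 hours and 7 days, and "clipped" is taken to mean an upper bound of 1.0. None of these details are confirmed by the card.

```python
import math
import numpy as np

def encode_features(amount, avg_amount_90d, hour, day_of_week, location_delta,
                    txns_1h, txns_24h, merchant_risk, international,
                    card_present, device_match, account_age_days, prev_fraud_score):
    """Build the 14-dimensional input vector in the index order of the feature table."""
    hour_angle = 2 * math.pi * hour / 24        # assumed period: 24 hours
    day_angle = 2 * math.pi * day_of_week / 7   # assumed period: 7 days
    return np.array([
        amount / avg_amount_90d,                     # 0  amount_vs_avg
        math.sin(hour_angle), math.cos(hour_angle),  # 1-2 hour_sin, hour_cos
        math.sin(day_angle), math.cos(day_angle),    # 3-4 day_sin, day_cos
        location_delta,                              # 5  location_delta
        min(txns_1h / 10, 1.0),                      # 6  velocity_1h (assumed clip at 1.0)
        min(txns_24h / 30, 1.0),                     # 7  velocity_24h (assumed clip at 1.0)
        merchant_risk,                               # 8  merchant_risk
        float(international),                        # 9  international
        float(card_present),                         # 10 card_present
        float(device_match),                         # 11 device_match
        account_age_days / 3650,                     # 12 account_age_norm
        prev_fraud_score,                            # 13 prev_fraud_score
    ], dtype=np.float64)

x = encode_features(
    amount=120.0, avg_amount_90d=100.0, hour=14, day_of_week=2,
    location_delta=0.1, txns_1h=1, txns_24h=3, merchant_risk=0.05,
    international=False, card_present=True, device_match=True,
    account_age_days=365, prev_fraud_score=0.0,
)
```

Encoding hour and day as sin/cos pairs keeps 23:00 and 00:00 close together in feature space, which a single raw hour value would not.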

Output

Three-class softmax: [legitimate, review, fraudulent]

Threshold modes control the decision boundary:

  • Standard: Balanced precision/recall
  • Conservative: Flags more transactions (fewer false negatives)
  • Strict: Flags fewer (fewer false positives)
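One way the threshold modes could map the three-class probabilities to a decision is sketched below. The cutoff values, the `THRESHOLDS` table, and the `decide` function are hypothetical; the card does not publish the actual thresholds or the decision rule.

```python
import numpy as np

# Hypothetical cutoffs: "fraud" applies to P(fraudulent),
# "review" applies to P(review) + P(fraudulent).
THRESHOLDS = {
    "standard":     {"fraud": 0.50, "review": 0.30},
    "conservative": {"fraud": 0.30, "review": 0.15},  # lower bars, more flags
    "strict":       {"fraud": 0.70, "review": 0.50},  # higher bars, fewer flags
}

def decide(probs, mode="standard"):
    """Map [legitimate, review, fraudulent] probabilities to a label."""
    _, p_review, p_fraud = probs
    t = THRESHOLDS[mode]
    if p_fraud >= t["fraud"]:
        return "fraudulent"
    if p_review + p_fraud >= t["review"]:
        return "review"
    return "legitimate"

probs = np.array([0.62, 0.03, 0.35])
labels = {mode: decide(probs, mode) for mode in THRESHOLDS}
print(labels)
# {'standard': 'review', 'conservative': 'fraudulent', 'strict': 'legitimate'}
```

With these illustrative cutoffs, the same borderline transaction is escalated in Conservative mode and passed in Strict mode, which is the trade-off the list above describes.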

Benchmarks: Public Datasets

KestrelNet and GoshawkNet were evaluated on public Kaggle datasets. Results on the public datasets are independently reproducible; the fraud detection row uses proprietary data.

| Dataset | Task | Accuracy | F1 / AUC | Params | Latency | Source |
|---|---|---|---|---|---|---|
| ECG Heartbeat (MIT-BIH) | 5-class arrhythmia | 97.2% | F1 0.853 | 12,756 | 56 μs | shayanfazeli/heartbeat |
| EEG Emotions | 3-class sentiment | 99.1% | F1 0.991 | 163,788 | 1.3 ms | birdy654/eeg-brainwave-dataset-feeling-emotions |
| EEG Eye State | Binary open/closed | 94.2% | AUC 0.986 | 1,576 | 17 μs | robikscube/eye-state-classification-eeg-dataset |
| Seizure Prediction (Bonn) | Binary seizure | 97.1% | AUC 0.988 | 12,072 | n/a | harunshimanto/epileptic-seizure-recognition |
| HAR Smartphones (UCI) | 6-class activity | 94.9% | F1 0.949 | 15,416 | 70 μs | uciml/human-activity-recognition-with-smartphones |
| Fraud Detection | 3-class fraud | 91.6% | n/a | 1,059 | 5 μs | Proprietary |

All benchmarks run on CPU. No GPU required. Pure NumPy inference.

Parameter Efficiency

For comparison, typical models on these datasets:

| Dataset | Typical CNN/LSTM | KestrelNet/GoshawkNet | Reduction |
|---|---|---|---|
| ECG Heartbeat | 500K–2M params | 12,756 | 40–160x smaller |
| EEG Emotions | 1M+ params | 163,788 | 6x smaller |
| EEG Eye State | 100K+ params | 1,576 | 63x smaller |
| HAR Smartphones | 200K–1M params | 15,416 | 13–65x smaller |

Quick Start

import numpy as np
from kestrelnet import KestrelNet

model = KestrelNet.from_pretrained("kestrelnet/fraud-classifier")
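# Raw values in feature order. This call passes 12 values, so it assumes predict()
# expands hour (14) and day of week (2) into sin/cos pairs and applies the
# normalizations listed under Input Features to produce the 14 model inputs.
scores = model.predict([1.2, 14, 2, 0.1, 1, 3, 0.05, False, True, True, 365, 0.0])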
# {'legitimate': 0.983, 'review': 0.017, 'fraudulent': 0.000}

Intended Use

  • Real-time fraud screening for payment processors
  • Pre-filter before heavier ML models (ensemble first stage)
  • Edge deployment where GPU is unavailable
  • Educational reference for from-scratch neural networks

Limitations

  • Trained on synthetic/proprietary data, so accuracy on your distribution will vary
  • 14 fixed features β€” cannot ingest raw transaction logs directly
  • No sequence modeling β€” treats each transaction independently
  • Small capacity means it cannot memorize complex fraud patterns

How to Cite

@misc{kestrelnet2026,
  title={KestrelNet: Sub-Kilobyte Neural Fraud Classifier},
  author={KestrelNet Team},
  year={2026},
  url={https://huggingface.co/kestrelnet/fraud-classifier}
}