DFDS: EfficientNet-B2 Deepfake Detection Engine.
An open-source, production-ready deepfake detection model optimized for FinTech Identity Verification (KYC) pipelines. The model uses a two-phase transfer learning approach on EfficientNet-B2 and reaches 96.00% accuracy on the in-domain test set. In zero-shot cross-dataset testing it scores 67.95% overall, with its strongest results on diffusion-based generation methods.
This repository contains the final inference weights (pytorch_model.bin) and a plug-and-play Python inference pipeline (inference.py) that uses RetinaFace to locate and crop faces.
Ecosystem Links.
- Training Dataset: ThinothW/Deepfake-Identity-Isolated-Dataset-PreP
- Full System Architecture (GitHub): DFDS-Final-Project
Quickstart: Running Inference.
This repository includes a standalone inference.py script that automatically handles RetinaFace face detection and cropping, tensor normalization, and device selection (CUDA, Apple MPS, or CPU).
1. Install Dependencies
pip install torch torchvision opencv-python retina-face pillow
2. Download and Run
Download pytorch_model.bin and inference.py to your local directory.
from inference import predict_deepfake
# Pass any raw image to the pipeline
result = predict_deepfake("sample_id_photo.jpg")
print(result)
# Output: {'prediction': 'FAKE', 'fake_confidence': '98.40%', 'real_confidence': '1.60%'}
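The confidence values in the returned dictionary are percentage strings, so downstream KYC logic usually needs them as numbers. The following is a minimal sketch of that conversion; `parse_confidence`, `needs_manual_review`, and the 0.90 review threshold are hypothetical helpers for illustration and are not part of inference.py.

```python
def parse_confidence(result):
    """Convert the percentage strings in a result dict to floats in [0, 1]."""
    return {
        "prediction": result["prediction"],
        "fake_confidence": float(result["fake_confidence"].rstrip("%")) / 100.0,
        "real_confidence": float(result["real_confidence"].rstrip("%")) / 100.0,
    }


def needs_manual_review(result, threshold=0.90):
    """Flag results whose top-class confidence is below a hypothetical threshold."""
    parsed = parse_confidence(result)
    top = max(parsed["fake_confidence"], parsed["real_confidence"])
    return top < threshold


# Sample dictionary copied from the Quickstart output above
example = {"prediction": "FAKE", "fake_confidence": "98.40%", "real_confidence": "1.60%"}
parsed = parse_confidence(example)
review = needs_manual_review(example)
```

The threshold is an assumption and should be tuned against the false-accept and false-reject rates your pipeline can tolerate.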
Performance Benchmarks (Run 5).
The model was evaluated using a strict identity-isolation protocol to ensure zero data leakage between Training and Testing splits.
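The exact splitting procedure is not documented here. As an illustration of what identity isolation means in practice, the sketch below assigns every image of a given person to the same split by hashing an identity label. The `(image_path, identity_id)` record layout, the `split_by_identity` name, and the 20% test fraction are assumptions.

```python
import hashlib


def split_by_identity(samples, test_fraction=0.2):
    """Split (image_path, identity_id) pairs so no identity appears in both splits."""
    train, test = [], []
    for path, identity in samples:
        # Hash the identity, not the image, so all images of a person land together
        digest = hashlib.sha256(str(identity).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
        (test if bucket < test_fraction else train).append((path, identity))
    return train, test


# Hypothetical data: 500 images spread over 50 identities
samples = [(f"img_{i}.jpg", f"person_{i % 50}") for i in range(500)]
train, test = split_by_identity(samples)

train_ids = {identity for _, identity in train}
test_ids = {identity for _, identity in test}
```

Because the split is decided per identity, `train_ids` and `test_ids` are disjoint by construction, which is the property that prevents a model from being scored on faces it has already seen.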
1. In-Domain Generalization (Primary Test Set).
Evaluated on 21,324 isolated images from the primary Training distribution.
| Metric | Score |
|---|---|
| Accuracy | 96.00% |
| ROC-AUC | 0.9930 |
| F1 Score | 0.9597 |
| Validation Loss | 0.1344 |
Class-Level Breakdown:
- Fake (0): Precision: 0.9563 | Recall: 0.9644 | F1: 0.9603
- Real (1): Precision: 0.9638 | Recall: 0.9555 | F1: 0.9597
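The per-class F1 values can be checked from the listed precision and recall, since F1 is their harmonic mean. A short sketch (the function name is illustrative):

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


fake_f1 = f1_from_pr(0.9563, 0.9644)  # Fake (0) row
real_f1 = f1_from_pr(0.9638, 0.9555)  # Real (1) row
print(f"Fake F1: {fake_f1:.4f}, Real F1: {real_f1:.4f}")
```

The recomputed values agree with the listed F1 scores to within the rounding of the four-decimal inputs. The headline F1 Score of 0.9597 matches the Real (1) row, which suggests Real is treated as the positive class.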
2. Cross-Dataset Generalization (Zero-Shot)
To test real-world resilience, the model was evaluated on 6,216 completely unseen images spanning 8 novel manipulation methods from the DF40 Benchmark.
- Overall Accuracy: 67.95%
- Overall ROC-AUC: 0.7150
Context on Architectural Generalization: This is a strict zero-shot evaluation. The results suggest the model learns features characteristic of generation families rather than memorizing its training datasets. On unseen data that shares a generative family with the training distribution (for example, diffusion-based Entire Face Synthesis (EFS) such as MidJourney and CollabDiff), accuracy stays at 83% or higher.
The drop in performance in the lower rows of the table below is expected: methods such as StarGAN-v2 and StyleClip use generation architectures (GANs and latent edits) that were absent from the training distribution. The model is most effective against the threat vectors most relevant to FinTech, EFS in particular (87.93% aggregate accuracy), with weaker performance on face swaps (69.97%).
| Generation Method | Type | Accuracy | F1 Score | Fake Detection % | Real Detection % |
|---|---|---|---|---|---|
| MidJourney | Diffusion / EFS | 93.13% | 0.9266 | 99.5% | 86.8% |
| CollabDiff | Diffusion / EFS | 83.13% | 0.8525 | 68.8% | 97.5% |
| HeyGen | Face Swap | 80.00% | 0.7050 | 51.2% | 89.8% |
| StarGAN | GAN | 74.81% | 0.7831 | 58.7% | 90.9% |
| DeepFaceLab | Face Swap | 69.37% | 0.6805 | 73.5% | 65.2% |
| StarGAN-v2 | GAN | 52.75% | 0.6730 | 8.2% | 97.2% |
| StyleClip | Latent Edit | 51.54% | 0.6696 | 4.9% | 98.2% |
| whichfaceisreal | GAN | 50.00% | 0.6667 | 0.0% | 100.0% |
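The F1 values near 0.67 in the lower rows can look better than they are. The sketch below derives accuracy and F1 from the two detection rates under two assumptions that are not stated in the table: each subset contains equal numbers of fake and real images, and F1 is computed with Real as the positive class.

```python
def balanced_metrics(fake_rate: float, real_rate: float) -> tuple[float, float]:
    """Accuracy and F1 (Real as positive) for a subset with equal Fake/Real counts.

    fake_rate: fraction of fake images correctly flagged as Fake
    real_rate: fraction of real images correctly accepted as Real
    """
    accuracy = (fake_rate + real_rate) / 2
    true_pos = real_rate       # real images predicted Real
    false_neg = 1 - real_rate  # real images predicted Fake
    false_pos = 1 - fake_rate  # fake images predicted Real
    f1 = 2 * true_pos / (2 * true_pos + false_pos + false_neg)
    return accuracy, f1


wfir_acc, wfir_f1 = balanced_metrics(0.0, 1.0)        # whichfaceisreal row
style_acc, style_f1 = balanced_metrics(0.049, 0.982)  # StyleClip row
```

Under these assumptions a detector that labels every image Real still scores an F1 of about 0.667 at 50% accuracy, which is exactly the whichfaceisreal row. Values near 0.67 therefore indicate near-total misses on fakes, not partial skill. Most rows fit the balanced assumption to within rounding; the HeyGen row does not (its two detection rates average 70.5%, not 80.00%), which may reflect a different class balance in that subset.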
3. Aggregated Threat Vector Performance.
| Threat Category | Support (N) | Accuracy | ROC-AUC |
|---|---|---|---|
| EFS (MidJourney + CollabDiff) | 1,542 | 87.93% | 0.9830 |
| Face Swap (HeyGen + DFL) | 1,502 | 69.97% | 0.7948 |
| Total FinTech Relevant (EFS + FS) | 3,044 | 79.07% | 0.8901 |
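The combined row follows from the two category rows: the supports add, and the combined accuracy is the support-weighted mean of the category accuracies. A quick check using only figures from the table:

```python
# (support N, accuracy %) per threat category, copied from the table above
categories = {
    "EFS": (1542, 87.93),
    "Face Swap": (1502, 69.97),
}

total_n = sum(n for n, _ in categories.values())
weighted_acc = sum(n * acc for n, acc in categories.values()) / total_n
print(total_n, round(weighted_acc, 2))  # 3044 79.07
```

ROC-AUC cannot be combined this way. It depends on how scores rank across the pooled samples, so the combined 0.8901 has to be computed from the raw predictions.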
Technical Architecture.
- Backbone: EfficientNet-B2 (Pre-trained on ImageNet)
- Custom Head: Dropout (p=0.3) -> Linear (1408 to 2 nodes)
- Loss Function: CrossEntropyLoss
- Optimizer: AdamW
- Input Resolution: 260x260 RGB (Normalized to ImageNet mean/std)
- Face Extraction: RetinaFace (Confidence Threshold > 0.90)
Author.
Thinod Wickramasinghe · University of Plymouth · 2026
GitHub: https://github.com/thinothw
Project Supervisor - Dr. Rasika Ranaweera.