| --- |
| license: mit |
| datasets: |
| - Parveshiiii/AI-vs-Real |
| base_model: |
| - microsoft/swinv2-tiny-patch4-window16-256 |
| pipeline_tag: image-classification |
| library_name: transformers |
| tags: |
| - safety |
| - Modotte |
| - SoTA |
| --- |
| # Modotte |
|
|
| <p align="center"> |
| <img |
| src="https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/m7ddTjuxxLUntdXVk0t5N.png" |
| alt="AIRealNet Banner" |
| width="90%" |
| style="border-radius:15px;" |
| /> |
| </p> |
|
|
| --- |
|
|
| - [GitHub Repository](https://github.com/XenArcAI/AIRealNet) |
| - [Live Demo](https://huggingface.co/spaces/Parveshiiii/AIRealNet) |
| |
| ## Overview |
|
|
In an era of rapidly advancing AI-generated imagery, deepfakes, and synthetic media, the need for reliable detection tools has never been greater. **AIRealNet** is a binary image classifier designed to distinguish **AI-generated images** from **real human photographs**. The model is optimized to detect conventional AI-generated content while adhering to strict privacy standards, avoiding personal or sensitive images.
|
|
| * **Class 0:** AI-generated image |
| * **Class 1:** Real human image |
|
|
| By leveraging the robust **SwinV2 Tiny** architecture as its backbone, AIRealNet achieves a high degree of accuracy while remaining lightweight enough for practical deployment. |
|
|
| --- |
|
|
| ## Key Features |
|
|
1. **High Accuracy on Public Datasets:**
Despite being fine-tuned on only a **14k-image subset of the main fine-tuning split**, AIRealNet demonstrates exceptional accuracy and robustness in detecting AI-generated images.
|
|
2. **Near-Balanced Training Split:**
The dataset contains a roughly even mix of AI-generated and real images, minimizing class-imbalance issues during training.


* **AI-Generated:** 60%
* **Human Images:** 40%
|
|
3. **Ethical Design:**
No personal photos were included, even if edited or AI-modified, respecting privacy and ethical AI principles.


4. **Fast and Scalable:**
Based on a transformer vision model, AIRealNet can be deployed efficiently in both research and production environments.
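Given the 60/40 split above, one common mitigation during fine-tuning is inverse-frequency class weighting in the loss. A minimal sketch, illustrative only and not necessarily how AIRealNet itself was trained:

```python
def inverse_frequency_weights(class_shares):
    """Loss weights inversely proportional to class frequency,
    normalized so they sum to the number of classes."""
    raw = {c: 1.0 / share for c, share in class_shares.items()}
    scale = len(raw) / sum(raw.values())
    return {c: w * scale for c, w in raw.items()}

# Shares from the card: class 0 (AI-generated) 60%, class 1 (real) 40%
weights = inverse_frequency_weights({0: 0.6, 1: 0.4})
print(weights)  # roughly {0: 0.8, 1: 1.2}; minority class weighted up
```

In a PyTorch training loop, such weights could be passed to `torch.nn.CrossEntropyLoss(weight=...)`.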
|
|
| --- |
|
|
| ## Training Data |
|
|
* **Dataset:** `Parveshiiii/AI-vs-Real` (an open-sourced subset of the main dataset)
| * **Size:** 14k images (balanced between AI and human) |
| * **Split:** Used the train split for fine-tuning; validation performed on a separate balanced subset. |
| * **Notes:** Images sourced from public datasets and AI generation tools. Edited personal photos were intentionally excluded. |
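As a rough sanity check on the split described above, per-class shares can be computed from the label column. A minimal standard-library sketch; the `labels` list here is illustrative, standing in for the dataset's real labels:

```python
from collections import Counter

def class_balance(labels):
    """Return each class's share of the total, e.g. {0: 0.6, 1: 0.4}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Illustrative labels: class 0 = AI-generated, class 1 = real human image
labels = [0] * 6 + [1] * 4
shares = class_balance(labels)
print(shares)  # {0: 0.6, 1: 0.4}
```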
|
|
| --- |
|
|
| ## Limitations |
|
|
| While AIRealNet performs exceptionally well on typical AI-generated images, users should note: |
|
|
1. **Subtle Edits:** The model struggles with very subtle, ultra-precise modifications, such as "nano banana"-style edits.
2. **AI-Edited Personal Images:** Images of real people that have been AI-modified are **not detected**, in line with the model's privacy-focused design.
3. **Domain Generalization:** Performance may vary on images from completely unseen AI generators or highly unconventional content.
|
|
| --- |
|
|
| ## Performance Metrics |
|
|
| > Metrics shown are from **Epoch 2**, chosen to illustrate stable performance after fine-tuning. |
|
|
| <p align="center"> |
| <img |
| src="https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/3NVa0KLX0iAxTP2e6IlGH.png" |
| alt="AIRealNet Banner" |
| width="90%" |
| style="border-radius:15px;" |
| /> |
| </p> |
|
|
**Note:** The extremely low loss and high accuracy reflect the controlled dataset environment; real-world performance may be lower depending on the image domain. In our testing the model is highly accurate on fully generated images, including edited ones, but it cannot detect "nano banana"-style edits of real photographs.
|
|
| --- |
|
|
| ## Demo and Usage |
|
|
1. **Install dependencies**


```bash
pip install -U transformers
```
| 2. **Loading and running a demo** |
|
|
| ```python |
| from transformers import pipeline |
| |
| pipe = pipeline("image-classification", model="Modotte/AIRealNet") |
| pipe("https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/eVkKUTdiInUl6pbIUghQC.png")# example image |
| ``` |
### Demo
|
|
* **Given Image**
|
|
| <p align="center"> |
| <img |
| src="https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/eVkKUTdiInUl6pbIUghQC.png" |
| alt="AIRealNet Banner" |
| width="90%" |
| style="border-radius:15px;" |
| /> |
| </p> |
|
|
| * **Model Output** |
|
|
| ```bash |
| [{'label': 'artificial', 'score': 0.9865425825119019}, |
| {'label': 'real', 'score': 0.013457471504807472}] |
| ``` |
**Note:** The prediction is correct, as the image was generated by a diffusion model.
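When building on this output, it is safer to take the top label and flag low-confidence predictions than to trust every score. A small helper sketch; the 0.8 threshold is an arbitrary illustration, not a calibrated value:

```python
def top_prediction(result, threshold=0.8):
    """Return the highest-scoring label, or 'uncertain' below the threshold."""
    top = max(result, key=lambda r: r["score"])
    return top["label"] if top["score"] >= threshold else "uncertain"

# The pipeline output shown above
result = [{"label": "artificial", "score": 0.9865425825119019},
          {"label": "real", "score": 0.013457471504807472}]
print(top_prediction(result))  # artificial
```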
|
|
| --- |
|
|
| ## Intended Use |
|
|
| * Detect AI-generated imagery on social media, research publications, and digital media platforms. |
| * Assist content moderators, researchers, and fact-checkers in identifying synthetic media. |
| * **Not intended** for legal verification without human corroboration. |
|
|
| --- |
|
|
| ## Ethical Considerations |
|
|
| * **Privacy-first Approach:** Personal photos, even if AI-edited, were excluded. |
| * **Responsible Deployment:** Users should combine model predictions with human review to avoid false positives or negatives. |
| * **Transparency:** The model card openly communicates its limitations and dataset design to prevent misuse. |
|
|
| --- |
|
|
| ## How It Works |
|
|
| 1. Images are preprocessed and resized to `256x256`. |
| 2. Features are extracted using the **SwinV2 Tiny** vision transformer backbone. |
| 3. A binary classification head outputs probabilities for AI-generated vs real human images. |
| 4. Predictions are interpreted as class 0 (AI) or class 1 (Human). |
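Steps 3 and 4 above (logits to probabilities to a class) amount to a softmax followed by an argmax. A self-contained sketch with illustrative logits; the real model produces logits in the same two-class layout:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

id2label = {0: "artificial", 1: "real"}  # class 0 = AI, class 1 = human
logits = [2.5, -1.8]                     # illustrative raw model outputs
probs = softmax(logits)
pred = max(range(len(probs)), key=lambda i: probs[i])
print(id2label[pred], round(probs[pred], 4))  # artificial 0.9866
```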
|
|
| --- |
|
|
| ## Future Work |
|
|
| Future iterations aim to: |
|
|
| * Improve detection of subtle AI-generated edits and “nano banana” modifications. |
| * Expand training data with diverse AI generators to enhance generalization. |
| * Explore multi-modal detection capabilities (e.g., video, metadata, and image combined). |
|
|
| --- |
|
|
| ### Citation |
| ```bibtex |
| @misc{Modotte_AIRealNet_2025, |
| title={AIRealNet: A Fine-Tuned Vision Transformer for Detecting AI-Generated vs Real Human Images}, |
| author={Parvesh Rawal}, |
| publisher={Hugging Face}, |
| year={2025}, |
| url={https://huggingface.co/Modotte/AIRealNet} |
| } |
| ``` |
|
|
| ## References |
|
|
| * Microsoft SwinV2 Tiny: [https://github.com/microsoft/Swin-Transformer](https://github.com/microsoft/Swin-Transformer) |
* `Parveshiiii/AI-vs-Real` dataset (open-sourced subset): [https://huggingface.co/datasets/Parveshiiii/AI-vs-Real](https://huggingface.co/datasets/Parveshiiii/AI-vs-Real)
|
|
| --- |
|
|
|
|