SSUPER-AIGID

On-device AI-generated image detector — 🥈 2nd place, LPCVC 2026 Track 3 (IEEE Low-Power Computer Vision Challenge, ECV Workshop @ CVPR 2026).

A vision-language model that not only classifies whether an image is AI-generated, but also explains its reasoning across 8 forensic criteria (lighting, edges, texture, perspective, physical plausibility, text, human elements, material detail). Designed to run fully on-device on the Snapdragon® 8 Elite NPU.
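As a rough illustration of how a caller might consume such an explanation, the sketch below checks that a JSON verdict covers all eight criteria. The schema is an assumption: the key names (`ai_generated`, `criteria`, and the snake_case criterion names) and the `parse_verdict` helper are hypothetical and are not documented by this repository.

```python
import json

# The eight criteria listed above. The snake_case spellings are an assumption.
CRITERIA = (
    "lighting", "edges", "texture", "perspective",
    "physical_plausibility", "text", "human_elements", "material_detail",
)

def parse_verdict(raw: str) -> dict:
    """Parse a hypothetical JSON verdict and check it covers every criterion."""
    data = json.loads(raw)
    scores = data.get("criteria", {})
    missing = [name for name in CRITERIA if name not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    return {"ai_generated": bool(data["ai_generated"]), "criteria": scores}
```

Rejecting a verdict with missing criteria keeps downstream code from silently treating an incomplete explanation as a full one.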

📦 This repository hosts the deployable QNN binary package (SSUPER-AIGID.zip, ~2.6 GB).
💻 Training / quantization code: GITHUB:LPCV-SSUPER-POWER

Results

Metric	Value
Challenge score	0.72
Throughput	31.21 tokens/sec (≈2× the requirement)
Placement	🥈 2nd place
Target hardware	Snapdragon 8 Elite QRD (QNN)
Package size	~2.62 GB

Approach

A 3-stage pipeline:

1. Annotation: Qwen2.5-VL auto-labels ~788K images with domain tags and per-criterion forensic scores (8 criteria).
2. Training: Qwen2-VL-2B fine-tuned with LoRA+ via curriculum learning: free-form reasoning → structured template → JSON output.
3. Quantization: AIMET W4A16 quantization for both the vision encoder and language decoder, exported to QNN for the Snapdragon NPU.
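To make the "W4" half of W4A16 (4-bit weights, 16-bit activations) concrete, here is a minimal pure-Python sketch of symmetric per-tensor 4-bit weight quantization. It is not AIMET's API and may differ from the scheme actually used here (for example per-channel scales or asymmetric ranges); the function names are illustrative only.

```python
def quantize_w4(weights):
    """Symmetric per-tensor 4-bit quantization: ints in [-8, 7] plus one scale."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 7 so the range stays symmetric around zero.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the 4-bit integers."""
    return [v * scale for v in q]
```

With this scheme every in-range weight is reconstructed to within half a quantization step (`scale / 2`), which is the error the quantization stage has to keep from degrading accuracy.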

Datasets

GenImage (ADM, BigGAN), SID-Set, SynthScars, ImageNet, COCO train2017, ARForensics.

Files

File	Description
`SSUPER-AIGID.zip`	QNN context binaries + embedding/position weights, tokenizer, and sample inputs for on-device inference.

After extraction the package contains the serialized model binaries (weight_sharing_model_*.serialized.bin, veg.serialized.bin), embedding_weights_151936x1536.raw, tokenizer.json, inputs.json, and the mask / position_ids tensors.
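The name `embedding_weights_151936x1536.raw` suggests a headerless dense table of 151936 × 1536 values, though the element type is not stated here. Under that assumption, the sketch below reads the shape from the file name and infers the element width from the file size; the helper names and any path you pass in are hypothetical.

```python
import os
import re

def shape_from_name(filename):
    """Extract (rows, cols) from a name like 'embedding_weights_151936x1536.raw'."""
    match = re.search(r"_(\d+)x(\d+)\.raw$", filename)
    if match is None:
        raise ValueError(f"no ROWSxCOLS suffix in {filename!r}")
    return int(match.group(1)), int(match.group(2))

def bytes_per_element(file_size, shape):
    """Infer element width in bytes; raise if the size does not fit the shape."""
    count = shape[0] * shape[1]
    width, remainder = divmod(file_size, count)
    if remainder or width == 0:
        raise ValueError("file size does not match the shape in the name")
    return width

def element_width_of(path):
    """Convenience wrapper: inspect a file on disk (assumes no header bytes)."""
    shape = shape_from_name(os.path.basename(path))
    return bytes_per_element(os.path.getsize(path), shape)
```

A result of 2 would be consistent with a 16-bit element type and 4 with a 32-bit one; the width alone does not tell you whether the values are floats or integers.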

Team

SSUPER_POWER (VIP Lab, Soongsil University): Dayoung Kil · Doeon Kim · Junyoon Lee

License

First-party code is released under the MIT License. The model is derived from Alibaba's Qwen2-VL-2B (Apache-2.0) and built with Qualcomm AIMET/QNN tooling and LLaMA-Factory (Apache-2.0); please review and comply with each upstream component's license.

Citation

@misc{ssuper-aigid-2026,
  title  = {SSUPER-AIGID: On-Device AI-Generated Image Detection},
  author = {Kil, Dayoung and Kim, Doeon and Lee, Junyoon},
  year   = {2026},
  note   = {2nd place, LPCVC 2026 Track 3 (ECV Workshop @ CVPR 2026)},
  url    = {https://github.com/LPCV-SSUPER-POWER/Track3-AI-Generated-Images-Detection}
}
