pp-nsfw_Inspector (Chinese version)

pp-nsfw_Inspector is an image content moderation pipeline running on the Axera NPU. It combines OCR, NSFW detection, QR code scanning, and keyword rule matching to classify images as PASS / REVIEW / REJECT.


Pipeline Overview

flowchart TD
    A[Input Image] --> B[Preprocess Layer<br/>Scene Classification + Image Processing + Long Image Slicing]
    B --> C{Process by Slice}

    C --> D[Image Branch<br/>NSFW + QR Code]
    C --> E{OCR Routing}

    E -->|SCREENSHOT| F[PP-OCRv5]
    E -->|DOCUMENT / POSTER / UNKNOWN| G[PP-DocLayout-S]
    G --> H[Text Region OCR]
    G --> I[Figure Region Extraction]
    I --> J[Figure Region NSFW]

    F --> K[OCR Result<br/>blocks + avg_score + text_state]
    H --> K
    D --> L[Image Signals<br/>nsfw / qr]
    J --> L

    K --> M[Understanding Layer<br/>Text Normalization + Strong/Weak Rules]
    M --> N[Text Signals<br/>rule_hits]
    L --> O[Decision Layer]
    K --> O
    N --> O

    O --> P{Final Action}
    P -->|PASS| Q[Release]
    P -->|REVIEW| R[Review]
    P -->|REJECT| S[Reject]
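
The per-slice routing in the flowchart can be sketched as follows. This is a minimal illustration, not the project's code: the `Scene` enum, the `plan_slice` function, and the stage names are hypothetical labels chosen to mirror the diagram.

```python
from enum import Enum
from typing import List


class Scene(Enum):
    SCREENSHOT = "SCREENSHOT"
    DOCUMENT = "DOCUMENT"
    POSTER = "POSTER"
    UNKNOWN = "UNKNOWN"


def plan_slice(scene: Scene) -> List[str]:
    """Return the stages one slice passes through (stage names are illustrative)."""
    # The image branch (NSFW + QR code) runs for every slice.
    stages = ["nsfw", "qr_code"]
    if scene is Scene.SCREENSHOT:
        # Screenshots are sent directly to PP-OCRv5.
        stages.append("ppocrv5")
    else:
        # DOCUMENT / POSTER / UNKNOWN go through layout analysis first,
        # then OCR on text regions and NSFW on extracted figure regions.
        stages += ["pp_doclayout_s", "text_region_ocr", "figure_region_nsfw"]
    return stages


if __name__ == "__main__":
    for scene in Scene:
        print(scene.value, "->", plan_slice(scene))
```

Only the branching is shown here; preprocessing, slicing, and the merge into the decision layer are left out.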

Supported Tasks

| Layer | Task | Method |
| --- | --- | --- |
| Preprocess | Scene classification (rule-based) | Screenshot / Document / Poster / Unknown |
| Perception | Text recognition (OCR) | PP-OCRv5 (det + cls + rec) |
| Perception | Layout analysis | PP-DocLayout-S |
| Perception | NSFW detection | ViT-based classifier |
| Perception | QR code detection & domain filtering | pyzbar + HTTP redirect expansion |
| Understanding | Text normalization | Traditional↔Simplified, full↔half-width, homophone map |
| Understanding | Keyword rule matching | pyahocorasick + google-re2 |
| Decision | Three-tier verdict | PASS / REVIEW / REJECT |
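
To make the understanding layer concrete, here is a minimal standard-library sketch of text normalization followed by keyword matching. It rests on several assumptions: the lookup tables and keyword are tiny hypothetical examples, full-width folding is done with Unicode NFKC, and plain substring search stands in for the pyahocorasick and google-re2 matchers named above.

```python
import unicodedata

# Hypothetical, deliberately tiny tables for illustration only.
TRAD_TO_SIMP = {"請": "请", "載": "载"}   # Traditional -> Simplified
HOMOPHONES = {"薇信": "微信"}              # evasive spelling -> canonical form


def normalize(text: str) -> str:
    # NFKC folds full-width letters and digits into their half-width forms.
    text = unicodedata.normalize("NFKC", text)
    # Character-level Traditional -> Simplified mapping.
    text = "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)
    # Replace known homophone variants with the canonical spelling.
    for variant, canonical in HOMOPHONES.items():
        text = text.replace(variant, canonical)
    return text


def match_keywords(text: str, keywords: set) -> list:
    # Naive substring scan; an Aho-Corasick automaton would do this in one pass.
    return sorted(k for k in keywords if k in text)


if __name__ == "__main__":
    cleaned = normalize("請加薇信 ＶＸ１２３")
    print(cleaned)
    print(match_keywords(cleaned, {"加微信"}))
```

Normalizing before matching means each rule only has to be written once, in its canonical form.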

Model Details

All models are quantized to w8a16 and exported in .axmodel format for the Axera NPU. The figures below were measured with ax_run_model -r 100 -w 10 (single-model benchmark: 100 iterations, 10 warm-up runs).

| Model | Path | Size (CMM) | Latency (avg, NPU) |
| --- | --- | --- | --- |
| PP-OCRv5 Det | axmodel/ppocrv5/det_npu1.axmodel | 57.79 MiB | 29.2 ms |
| PP-OCRv5 Cls | axmodel/ppocrv5/cls_npu1.axmodel | 0.62 MiB | 0.3 ms |
| PP-OCRv5 Rec | axmodel/ppocrv5/rec_npu1.axmodel | 6.14 MiB | 3.4 ms |
| PP-DocLayout-S | axmodel/ppstructurev3/ppstructure_npu1.axmodel | 58.29 MiB | 8.8 ms |
| NSFW | axmodel/nsfw/nsfw_npu1.axmodel | 91.14 MiB | 30.0 ms |

Model conversion tools: Pulsar2 (ver 5.2+). Engine version: 2.10.1s.
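
For capacity planning, the table figures can be combined with some simple arithmetic. The sketch below is a rough estimate only and makes assumptions the benchmark does not confirm: all five models are resident in CMM at once, and on the screenshot path the Cls and Rec models each run once per detected text line, sequentially, with pre/post-processing and QR scanning ignored. The function names are hypothetical.

```python
# (CMM size in MiB, average NPU latency in ms), copied from the table above.
MODELS = {
    "ppocrv5_det": (57.79, 29.2),
    "ppocrv5_cls": (0.62, 0.3),
    "ppocrv5_rec": (6.14, 3.4),
    "pp_doclayout_s": (58.29, 8.8),
    "nsfw": (91.14, 30.0),
}


def total_cmm_mib() -> float:
    # Memory footprint if every model is loaded at the same time (assumption).
    return round(sum(size for size, _ in MODELS.values()), 2)


def screenshot_slice_latency_ms(text_lines: int) -> float:
    # Assumed cost model: NSFW + Det once, then Cls + Rec per text line.
    per_line = MODELS["ppocrv5_cls"][1] + MODELS["ppocrv5_rec"][1]
    fixed = MODELS["nsfw"][1] + MODELS["ppocrv5_det"][1]
    return round(fixed + text_lines * per_line, 2)


if __name__ == "__main__":
    print("total CMM (MiB):", total_cmm_mib())
    print("est. NPU time for a 10-line screenshot slice (ms):",
          screenshot_slice_latency_ms(10))
```

Under this cost model, recognition time grows linearly with the number of text lines, so text-dense screenshots spend most of their NPU time in Rec rather than Det.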


Supported Platforms


How to Use

Python Environment

pyaxengine

# pyaxengine
wget https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc3/axengine-0.1.3-py3-none-any.whl
pip install axengine-0.1.3-py3-none-any.whl

# Other dependencies
pip install -r requirements.txt

Note: pyzbar requires the system zbar shared library.

Local Test (CLI)

python test.py

The script iterates over the images in images/ and prints the decision for each one, for example:

action:     REJECT
risk_level: high
labels:     ['nsfw']
score:      0.987
evidence:   [{'source': 'nsfw_model', 'score': 0.987}]

Web Demo

python app.py

Open http://127.0.0.1:5000/ in a browser. The web page supports:

  • Viewing a default sample image result
  • Uploading local images for moderation
  • Displaying decision, risk level, labels, evidence, slice details, OCR status, normalized text, and rule hits

Run in background:

setsid python app.py > web.log 2>&1 < /dev/null &
# Stop: pkill -f 'python app.py'
# Logs: tail -f web.log

Decision Semantics

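The decision layer applies tiered OR logic: any single strong signal yields REJECT; otherwise any single weak signal yields REVIEW; an image with no signals is PASS.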

Strong Signals → REJECT

  • Strong keyword rule hit
  • QR code blacklist domain
  • NSFW score at or above scene-specific reject threshold

Weak Signals → REVIEW

  • Weak keyword rule hit
  • QR code unknown domain
  • NSFW score at or above review threshold
  • OCR low quality (avg_score < 0.65 and blocks < 3)
  • OCR missing expected text / missing uncertain
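
The QR code signals above depend on sorting each decoded URL into a domain category. A minimal sketch of that step follows, assuming a blacklist plus an allowlist of known domains; the allowlist, the example domains, and the `classify_qr_url` name are hypothetical, and HTTP redirect expansion is omitted because it needs network access.

```python
from urllib.parse import urlsplit

# Hypothetical domain lists for illustration only.
BLACKLIST = {"bad.example"}
ALLOWLIST = {"good.example"}


def _in_domains(host: str, domains: set) -> bool:
    # Match the domain itself or any of its subdomains.
    return any(host == d or host.endswith("." + d) for d in domains)


def classify_qr_url(url: str) -> str:
    """Return 'blacklist', 'known', or 'unknown' for a decoded QR payload."""
    host = (urlsplit(url).hostname or "").lower()
    if _in_domains(host, BLACKLIST):
        return "blacklist"   # strong signal -> REJECT
    if _in_domains(host, ALLOWLIST):
        return "known"       # no signal
    return "unknown"         # weak signal -> REVIEW


if __name__ == "__main__":
    for url in ("https://shop.bad.example/pay",
                "https://good.example/",
                "https://other.example/x"):
        print(url, "->", classify_qr_url(url))
```

Payloads that are not URLs have no hostname and fall through to "unknown", which matches the conservative REVIEW stance.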

NSFW Thresholds by Scene

| Scene | review | reject |
| --- | --- | --- |
| SCREENSHOT | 0.60 | 0.93 |
| DOCUMENT | 0.60 | 0.95 |
| POSTER | 0.60 | 0.85 |
| UNKNOWN | 0.60 | 0.90 |
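
Combining the signal lists with the thresholds above, the tiered logic can be sketched as a single function. This is an illustrative reading of the rules rather than the project's implementation: the `decide` function and its parameter names are hypothetical, and the two "OCR missing" conditions are collapsed into one boolean flag.

```python
# Scene -> (review threshold, reject threshold), copied from the table above.
NSFW_THRESHOLDS = {
    "SCREENSHOT": (0.60, 0.93),
    "DOCUMENT": (0.60, 0.95),
    "POSTER": (0.60, 0.85),
    "UNKNOWN": (0.60, 0.90),
}


def decide(scene, nsfw_score, strong_rule_hit=False, weak_rule_hit=False,
           qr_status=None, ocr_avg_score=None, ocr_blocks=None,
           ocr_text_missing=False):
    review_t, reject_t = NSFW_THRESHOLDS[scene]

    # Tier 1: any strong signal is enough to reject.
    if strong_rule_hit or qr_status == "blacklist" or nsfw_score >= reject_t:
        return "REJECT"

    # OCR counts as low quality only when both conditions hold.
    ocr_low_quality = (
        ocr_avg_score is not None and ocr_blocks is not None
        and ocr_avg_score < 0.65 and ocr_blocks < 3
    )

    # Tier 2: any weak signal sends the image to review.
    if (weak_rule_hit or qr_status == "unknown" or nsfw_score >= review_t
            or ocr_low_quality or ocr_text_missing):
        return "REVIEW"

    return "PASS"


if __name__ == "__main__":
    print(decide("POSTER", 0.87))     # above the POSTER reject threshold
    print(decide("DOCUMENT", 0.87))   # below DOCUMENT reject, above review
    print(decide("SCREENSHOT", 0.10))
```

The same score of 0.87 rejects a poster but only flags a document for review, which is the effect of the scene-specific reject thresholds.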

Output Fields

| Field | Description |
| --- | --- |
| action | Final verdict: PASS / REVIEW / REJECT |
| risk_level | low / medium / high |
| primary_reason | Top contributing factor for quick triage |
| labels | All matched reasons (includes soft signals even on REJECT) |
| score | Max confidence score across all signals |
| evidence | All matched signal details (source, score, domains, etc.) |
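
One way to assemble these fields from a list of matched signals is sketched below. Several parts are assumptions rather than documented behavior: the `Decision` dataclass, the `label` key on each evidence item, the action-to-risk mapping, and the choice of the highest-scoring signal as `primary_reason`.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed mapping; the source only shows REJECT paired with "high".
RISK_BY_ACTION = {"PASS": "low", "REVIEW": "medium", "REJECT": "high"}


@dataclass
class Decision:
    action: str
    risk_level: str
    primary_reason: str
    labels: List[str]
    score: float
    evidence: List[Dict]


def build_decision(action: str, evidence: List[Dict]) -> Decision:
    # The highest-scoring signal drives both score and primary_reason (assumption).
    top = max(evidence, key=lambda e: e["score"], default=None)
    # Keep every matched label once, in first-seen order, even on REJECT.
    labels = list(dict.fromkeys(e["label"] for e in evidence))
    return Decision(
        action=action,
        risk_level=RISK_BY_ACTION[action],
        primary_reason=top["label"] if top else "",
        labels=labels,
        score=top["score"] if top else 0.0,
        evidence=evidence,
    )


if __name__ == "__main__":
    signals = [
        {"source": "nsfw_model", "label": "nsfw", "score": 0.987},
        {"source": "rule_engine", "label": "weak_keyword", "score": 0.5},
    ]
    print(build_decision("REJECT", signals))
```

Keeping weaker labels alongside the rejecting one gives reviewers the full picture when they audit a verdict.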

Limitations

  • Not a production service (no auth, no access control)
  • REVIEW strategy is conservative (cold-start phase)
  • Rules and thresholds are in early tuning stage
  • Designed for development, integration testing, and demonstrations

Other

  • PP-OCRv5 models: AXERA-TECH/PPOCR_v5
  • Test images: all images under images/ are website screenshots used for internal regression testing.