metadata
language:
  - en
pipeline_tag: other
tags:
  - image-classification
  - vision-language
  - safety
  - multimodal-safety
  - concept-detection
  - twohamsters
  - mccu
datasets:
  - TwoHamsters

MCCU-ViT

MCCU-ViT is a multi-head visual classifier for auditing whether generated images contain unsafe multi-concept combinations from the MCCU/TwoHamsters benchmark. The model uses an OpenCLIP ViT-L-14-CLIPA-336 visual backbone pretrained on datacomp1b, followed by three lightweight classification heads:

  • Concept_1 head
  • Concept_2 head
  • harm-category head

The classifier is intended for post-hoc evaluation of text-to-image or image-conditioned generation outputs. Given an image, it predicts the most likely first concept, second concept, and harm category. Each head also has a final Safe/Other class, appended after its known labels, for images that fall outside the known unsafe concept or category set.
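To make the head layout concrete, here is a minimal, framework-free sketch of how one head's raw scores could be turned into a label and a confidence. It assumes each head emits one logit per known label plus one trailing logit for Safe/Other, and that the reported confidence is a softmax probability. The `decode_head` helper, the example vocabulary, and the literal string "Safe/Other" are illustrative assumptions, not the repository's code.

```python
import math

SAFE_LABEL = "Safe/Other"  # assumed spelling of the appended label


def decode_head(logits, vocab):
    """Map one head's logits to (label, confidence).

    `vocab` lists the known unsafe labels; the Safe/Other label is
    appended last, so len(logits) must equal len(vocab) + 1.
    """
    labels = list(vocab) + [SAFE_LABEL]
    if len(logits) != len(labels):
        raise ValueError("expected one logit per known label plus Safe/Other")
    # Numerically stable softmax over the head's logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the highest-probability class.
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical vocabulary and logits, for illustration only.
label, conf = decode_head([0.2, 2.5, -1.0], ["concept_a", "concept_b"])
```

The same decoding would apply independently to the Concept_1, Concept_2, and harm-category heads, each with its own vocabulary.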

Files

This repository contains:

| File | Description |
| --- | --- |
| backbone.pt | Fine-tuned OpenCLIP visual-backbone weights. |
| vit_multi_task_heads.pt | Multi-task classification heads: head_c1, head_c2, and head_cls. |
| TwoHamsters_text-only_17.5k.csv | CSV used to recover concept/category label vocabularies and provide benchmark prompts. |
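As a rough illustration of recovering label vocabularies from a CSV, the sketch below collects the unique values of a column and sorts them. The column names (`concept_1`, `concept_2`, `category`), the in-memory sample rows, and the choice of alphabetical ordering are all assumptions made for this example. The real file's schema and the label order used during training may differ, and the index order must match whatever the heads were trained with.

```python
import csv
import io


def build_vocab(rows, column):
    """Return the sorted unique non-empty values found in `column`."""
    values = {(row.get(column) or "").strip() for row in rows}
    values.discard("")  # ignore missing or blank cells
    return sorted(values)


# Tiny in-memory stand-in for the benchmark CSV (hypothetical schema).
SAMPLE = """prompt,concept_1,concept_2,category
p1,alpha,beta,cat_x
p2,gamma,beta,cat_y
p3,alpha,delta,cat_x
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
c1_vocab = build_vocab(rows, "concept_1")
c2_vocab = build_vocab(rows, "concept_2")
cat_vocab = build_vocab(rows, "category")
```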

Intended Use

MCCU-ViT is designed for research use in multimodal safety evaluation, especially:

  • detecting whether a generated image contains a target unsafe concept pair;
  • computing attack or erasure metrics such as UAR/MDR/CMDR;
  • auditing generated images from text-to-image, image-editing, and image-fusion systems.
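The first use case, checking for a target concept pair, can be sketched on top of the prediction dictionaries the model returns. The helpers below are hypothetical and treat the pair as ordered (first concept, then second concept). The resulting rate is a plain match fraction for illustration and is not claimed to be the benchmark's definition of UAR, MDR, or CMDR.

```python
def matches_target_pair(pred, target_c1, target_c2):
    """True if the predicted (concept_1, concept_2) equals the target pair."""
    return pred["concept_1"] == target_c1 and pred["concept_2"] == target_c2


def pair_match_rate(preds, target_c1, target_c2):
    """Fraction of predictions whose concept pair equals the target pair."""
    if not preds:
        return 0.0  # avoid dividing by zero on an empty batch
    hits = sum(matches_target_pair(p, target_c1, target_c2) for p in preds)
    return hits / len(preds)


# Hypothetical predictions for four generated images.
preds = [
    {"concept_1": "alpha", "concept_2": "beta"},
    {"concept_1": "alpha", "concept_2": "delta"},
    {"concept_1": "beta", "concept_2": "alpha"},
    {"concept_1": "alpha", "concept_2": "beta"},
]
rate = pair_match_rate(preds, "alpha", "beta")
```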

Ethical Considerations

This model is released to support research on text-to-image (T2I) generation safety and robustness. It should not be used to generate harmful content, to target individuals or groups, or to make high-stakes moderation decisions without human review and additional validation.

Quick Start

Install dependencies:

pip install torch pillow pandas open_clip_torch huggingface_hub

Run inference:

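from PIL import Image
# MCCUViT comes from the local module example_inference, which must be importable.
from example_inference import MCCUViT

# device="cuda" requires a CUDA-capable GPU.
model = MCCUViT.from_pretrained("chaoshuo/TwoHamsters", device="cuda")
# predict takes a PIL image and returns a dict in the output format shown below.
pred = model.predict(Image.open("example.png"))
print(pred)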

Output format:

{
    "concept_1": "...",
    "concept_2": "...",
    "category": "...",
    "concept_1_confidence": 0.0,
    "concept_2_confidence": 0.0,
    "category_confidence": 0.0,
}
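One way to consume this dictionary is to route low-confidence predictions to a human reviewer. The sketch below is an assumption-laden example: the `needs_review` helper and the 0.5 threshold are hypothetical choices, not part of the released code, and any real threshold would need to be validated on held-out data.

```python
CONF_KEYS = (
    "concept_1_confidence",
    "concept_2_confidence",
    "category_confidence",
)


def needs_review(pred, threshold=0.5):
    """True if any of the three head confidences falls below `threshold`."""
    return min(pred[key] for key in CONF_KEYS) < threshold


# Hypothetical predictions in the documented output format.
confident = {
    "concept_1": "alpha", "concept_2": "beta", "category": "cat_x",
    "concept_1_confidence": 0.91,
    "concept_2_confidence": 0.84,
    "category_confidence": 0.77,
}
uncertain = dict(confident, concept_2_confidence=0.32)

flags = [needs_review(confident), needs_review(uncertain)]
```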

Citation

If you use MCCU-ViT or the MCCU/TwoHamsters benchmark, please cite the corresponding paper or project release.

@article{zhang2026twohamsters,
  title={TwoHamsters: Benchmarking Multi-Concept Compositional Unsafety in Text-to-Image Models},
  author={Zhang, Chaoshuo and Liang, Yibo and Tian, Mengke and Lin, Chenhao and Zhao, Zhengyu and Yang, Le and Zhang, Chong and Zhang, Yang and Shen, Chao},
  journal={arXiv preprint arXiv:2604.15967},
  year={2026}
}