---
language:
- en
pipeline_tag: other
tags:
- image-classification
- vision-language
- safety
- multimodal-safety
- concept-detection
- twohamsters
- mccu
datasets:
- TwoHamsters
---
# MCCU-ViT
MCCU-ViT is a multi-head visual classifier for auditing whether generated images contain unsafe multi-concept combinations from the MCCU/TwoHamsters benchmark. The model uses an OpenCLIP `ViT-L-14-CLIPA-336` visual backbone pretrained on `datacomp1b`, followed by three lightweight classification heads:
- `Concept_1` head
- `Concept_2` head
- harm-category head
The classifier is intended for post-hoc evaluation of text-to-image or image-conditioned generation outputs. Given an image, it predicts the most likely first concept, second concept, and harm category. Each head's label set ends with an extra `Safe/Other` class, which covers images that fall outside the known unsafe concept or category set.
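As a rough illustration of how one head's output could be turned into a label, the sketch below decodes a list of logits against a vocabulary with `Safe/Other` in the last slot. It assumes the confidence is a softmax probability and that each head emits one more logit than its vocabulary size; neither detail is confirmed by this README, and `decode_head`, `softmax`, and the toy vocabulary are hypothetical names used only for illustration.

```python
import math

SAFE_LABEL = "Safe/Other"  # extra class assumed to sit last in every head


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_head(logits, vocab):
    """Map one head's logits to (label, confidence).

    `vocab` holds the known unsafe labels; the head is assumed to emit
    len(vocab) + 1 logits, the last one standing for Safe/Other.
    """
    labels = list(vocab) + [SAFE_LABEL]
    if len(logits) != len(labels):
        raise ValueError(f"expected {len(labels)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Toy example with a hypothetical two-label vocabulary.
label, confidence = decode_head([0.1, 0.2, 3.0], ["concept_a", "concept_b"])
print(label, round(confidence, 3))
```

The same function would be applied once per head (`Concept_1`, `Concept_2`, harm category), each with its own vocabulary.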
## Files
This repository contains:
| File | Description |
|---|---|
| `backbone.pt` | Fine-tuned OpenCLIP visual-backbone weights. |
| `vit_multi_task_heads.pt` | Multi-task classification heads: `head_c1`, `head_c2`, and `head_cls`. |
| `TwoHamsters_text-only_17.5k.csv` | CSV used to recover concept/category label vocabularies and provide benchmark prompts. |
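The following is a minimal sketch of how label vocabularies might be recovered from a CSV like the one listed above. The column names (`Concept_1`, `Concept_2`, `category`), the sample rows, and the alphabetical ordering are all assumptions; the real file's schema and the label order expected by the released heads are not documented here and should be checked against the checkpoint before use.

```python
import csv
import io


def build_vocab(rows, column):
    """Return the sorted unique non-empty values found in `column`."""
    values = {(row.get(column) or "").strip() for row in rows}
    values.discard("")  # ignore missing or blank cells
    return sorted(values)


# Stand-in for the real CSV; column names and values are hypothetical.
sample = io.StringIO(
    "prompt,Concept_1,Concept_2,category\n"
    "p1,apple,knife,violence\n"
    "p2,apple,rope,violence\n"
    "p3,cat,knife,other_cat\n"
)
rows = list(csv.DictReader(sample))

vocab_c1 = build_vocab(rows, "Concept_1")
vocab_c2 = build_vocab(rows, "Concept_2")
vocab_cat = build_vocab(rows, "category")
print(vocab_c1, vocab_c2, vocab_cat)
```

To read the actual file, the `io.StringIO` stand-in would be replaced with `open(path, newline="", encoding="utf-8")`.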
## Intended Use
MCCU-ViT is designed for research use in multimodal safety evaluation, especially:
- detecting whether a generated image contains a target unsafe concept pair;
- computing attack or erasure metrics such as UAR/MDR/CMDR;
- auditing generated images from text-to-image, image-editing, and image-fusion systems.
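This README does not define UAR, MDR, or CMDR, so the sketch below does not implement them. It only shows a generic building block such metrics could rest on: checking whether a prediction matches a target concept pair and averaging that over a set of images. The function names and the `ordered` flag are hypothetical, and the prediction dictionaries are assumed to carry `concept_1` and `concept_2` keys.

```python
def pair_matches(pred, target_pair, ordered=False):
    """Check whether a prediction names the target concept pair.

    With ordered=False the two concepts may appear in either slot.
    """
    predicted = (pred["concept_1"], pred["concept_2"])
    if ordered:
        return predicted == tuple(target_pair)
    return sorted(predicted) == sorted(target_pair)


def pair_match_rate(preds, target_pair, ordered=False):
    """Fraction of predictions that match the target pair (0.0 if empty)."""
    if not preds:
        return 0.0
    hits = sum(pair_matches(p, target_pair, ordered) for p in preds)
    return hits / len(preds)


# Toy predictions with hypothetical concept names.
preds = [
    {"concept_1": "a", "concept_2": "b"},
    {"concept_1": "b", "concept_2": "a"},
    {"concept_1": "a", "concept_2": "Safe/Other"},
]
rate = pair_match_rate(preds, ("a", "b"))
strict_rate = pair_match_rate(preds, ("a", "b"), ordered=True)
print(rate, strict_rate)
```

Whether slot order should matter depends on how the benchmark assigns concepts to the two heads, which is why the flag is left to the caller.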
## Ethical Considerations
This model is released to support research on the safety and robustness of text-to-image (T2I) generation. It should not be used to generate harmful content, target individuals or groups, or make high-stakes moderation decisions without human review and additional validation.
## Quick Start
Install dependencies:
```bash
pip install torch pillow pandas open_clip_torch huggingface_hub
```
Run inference:
```python
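import torch
from PIL import Image
from example_inference import MCCUViT  # assumes example_inference.py is on the import path

# Fall back to CPU when no GPU is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = MCCUViT.from_pretrained("chaoshuo/TwoHamsters", device=device)

# Predict both concepts and the harm category for a single image.
pred = model.predict(Image.open("example.png"))
print(pred)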
```
Output format:
```python
{
    "concept_1": "...",
    "concept_2": "...",
    "category": "...",
    "concept_1_confidence": 0.0,
    "concept_2_confidence": 0.0,
    "category_confidence": 0.0,
}
```
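A prediction in the format above can be post-processed with ordinary dictionary access. As one hedged example, the sketch below flags predictions whose weakest head confidence falls under a cutoff, so they can be routed to human review. The `needs_review` helper and the `0.5` threshold are hypothetical choices, not part of the released code.

```python
CONFIDENCE_KEYS = (
    "concept_1_confidence",
    "concept_2_confidence",
    "category_confidence",
)


def needs_review(pred, threshold=0.5):
    """True if any head's confidence is below `threshold` (hypothetical cutoff)."""
    return min(pred[key] for key in CONFIDENCE_KEYS) < threshold


# Toy predictions with made-up values in the documented output format.
examples = [
    {"concept_1": "a", "concept_2": "b", "category": "c",
     "concept_1_confidence": 0.9, "concept_2_confidence": 0.8,
     "category_confidence": 0.95},
    {"concept_1": "a", "concept_2": "b", "category": "c",
     "concept_1_confidence": 0.9, "concept_2_confidence": 0.3,
     "category_confidence": 0.95},
]
flagged = [p for p in examples if needs_review(p)]
print(len(flagged))
```

A suitable cutoff would need to be calibrated on held-out data rather than taken from this example.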
## Citation
If you use MCCU-ViT or the MCCU/TwoHamsters benchmark, please cite the corresponding paper or project release.
```bibtex
@article{zhang2026twohamsters,
  title={TwoHamsters: Benchmarking Multi-Concept Compositional Unsafety in Text-to-Image Models},
  author={Zhang, Chaoshuo and Liang, Yibo and Tian, Mengke and Lin, Chenhao and Zhao, Zhengyu and Yang, Le and Zhang, Chong and Zhang, Yang and Shen, Chao},
  journal={arXiv preprint arXiv:2604.15967},
  year={2026}
}
```