Dose-Response C4 (1M-1%): 1M subset preserving original proportion

Part of a dose-response experiment studying how the fraction of unsafe training data affects the output safety of text-to-image models.


Label	C4 (1M-1%)
Description	Random 1M subset preserving the original 1.21% unsafe proportion.
Training set size N	1.00M
Unsafe fraction p	1.21%
Unsafe count U	~12K
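
The unsafe count in the table follows from U = p × N. The snippet below is a minimal sketch of that arithmetic, together with one way a proportion-preserving subset could be drawn. The stratified draw is an assumption: this card only states that the subset is random and preserves the original proportion, so the actual procedure may differ (a plain uniform sample, for example, preserves the proportion only in expectation). The names `unsafe_count` and `stratified_subset`, and the seed value, are hypothetical.

```python
import random


def unsafe_count(n_total: int, unsafe_fraction: float) -> int:
    """Number of unsafe samples U implied by set size N and fraction p."""
    return round(n_total * unsafe_fraction)


def stratified_subset(safe_ids, unsafe_ids, n_total, unsafe_fraction, seed=0):
    """Draw n_total ids with exactly the requested unsafe fraction.

    Hypothetical helper: samples the safe and unsafe pools separately so the
    subset has the target proportion exactly rather than only in expectation.
    """
    rng = random.Random(seed)  # seed is arbitrary, not taken from the card
    n_unsafe = unsafe_count(n_total, unsafe_fraction)
    n_safe = n_total - n_unsafe
    return rng.sample(safe_ids, n_safe) + rng.sample(unsafe_ids, n_unsafe)


N = 1_000_000   # training set size from the table
p = 0.0121      # unsafe fraction from the table
U = unsafe_count(N, p)  # 12,100, reported as ~12K

# Toy demonstration on a small id pool (ids >= 1000 stand in for unsafe items).
toy = stratified_subset(list(range(1000)), list(range(1000, 1100)), 200, 0.05)
toy_unsafe = sum(1 for i in toy if i >= 1000)
```

On the toy pool, the 200-item subset contains 10 unsafe ids, matching the requested 5%.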


Iterations	100,000
Samples seen	~25.60M
Global batch size	256
Microbatch (per GPU)	32
Hardware	8× NVIDIA H200
Precision	bfloat16 (amp_bf16)
Optimizer (transformer blocks)	Muon (lr=1e-4, momentum=0.95, nesterov, ns_steps=5, weight_decay=0)
Optimizer (other params)	AdamW (lr=1e-4, β=(0.9, 0.95), eps=1e-8, weight_decay=0)
LR schedule	1,000-step linear warmup, constant after
EMA	decay 0.999, started at step 0
Random seed	42
Trainer	Composer + FSDP
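
The figures in the table above are mutually consistent, which the following sketch checks. It also illustrates the learning-rate schedule and the EMA update in plain Python. This is not the PRX or Composer implementation: the warmup is assumed to ramp linearly from 0 at step 0 to the base rate at step 1,000, and the EMA is assumed to use the standard exponential-average form. `lr_at` and `ema_update` are hypothetical names.

```python
ITERATIONS = 100_000
GLOBAL_BATCH = 256
MICROBATCH_PER_GPU = 32
NUM_GPUS = 8
TRAIN_SET_SIZE = 1_000_000
BASE_LR = 1e-4
WARMUP_STEPS = 1_000
EMA_DECAY = 0.999

# Samples seen = iterations x global batch size.
samples_seen = ITERATIONS * GLOBAL_BATCH  # 25,600,000

# 8 GPUs x 32 per-GPU microbatch already equals the global batch of 256,
# so one microbatch per GPU per step suffices (no gradient accumulation).
grad_accum_steps = GLOBAL_BATCH // (MICROBATCH_PER_GPU * NUM_GPUS)

# Approximate number of passes over the 1M-image training set.
passes_over_data = samples_seen / TRAIN_SET_SIZE


def lr_at(step: int) -> float:
    """Linear warmup to BASE_LR over WARMUP_STEPS, constant afterwards.

    Assumption: the ramp starts at 0 on step 0.
    """
    return BASE_LR * min(1.0, step / WARMUP_STEPS)


def ema_update(ema_value: float, value: float, decay: float = EMA_DECAY) -> float:
    """One EMA step for a single scalar weight (standard form, assumed)."""
    return decay * ema_value + (1.0 - decay) * value
```

With these numbers the model sees each training image roughly 25.6 times, and the learning rate is constant for 99% of the run.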

The training set combines three image datasets, with per-condition filtering and oversampling.

Trained with the PRX framework (Composer + FSDP). The full config.yaml is included for reproducibility.
