---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- cafm
- continuous-adversarial-flow-models
- class-conditional
- imagenet
- text-to-image
- z-image
inference: true
widget:
- output:
    url: CAFM-JiT-H-16-256/demo.png
language:
- en
---
# BiliSakura/CAFM-diffusers
Self-contained [Continuous Adversarial Flow Models](https://arxiv.org/abs/2604.11521) checkpoints for Hugging Face diffusers.
Converted from `ByteDance-Seed/Adversarial-Flow-Models` using `libs/AFM-diffusers/scripts/convert_cafm_to_diffusers.py`.
Z-Image weights are bundled self-contained under `CAFM-Z-Image-T2I/`.
## Demo
`CAFM-JiT-H-16-256` — class **207** (*golden retriever*), seed **0**, 100 NFE (Heun):
<p align="center">
<img src="CAFM-JiT-H-16-256/demo.png" alt="CAFM-JiT-H-16-256 demo (class 207, seed 0)" width="256"/>
</p>
Each variant folder includes `demo.png` generated with the same prompt settings.
## Benchmark results (ImageNet 256×256)
| Model | Space | NFE | FID | Checkpoint |
| --- | --- | --- | --- | --- |
| CAFM JiT-H/16 | pixel | 100 | 1.80 | `CAFM-JiT-H-16-256/` |
| CAFM SiT-XL/2 | latent | 250 | 1.53 | `CAFM-SiT-XL-2-256/` |
| CAFM Z-Image | latent T2I | 25 | — | `CAFM-Z-Image-T2I/` |
## Available checkpoints
| Variant | Backbone | Steps | Solver |
| --- | --- | ---: | --- |
| `CAFM-JiT-H-16-256/` | JIT | 100 | heun |
| `CAFM-SiT-XL-2-256/` | SIT | 250 | heun |
| `CAFM-Z-Image-T2I/` | Z-IMAGE | 25 | euler |
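The steps and solver in the table above can be kept in one place so that each pipeline call uses the settings listed for its variant. The following is a minimal sketch, not part of the repository: `SamplingDefaults`, `DEFAULTS`, and `sampling_kwargs` are hypothetical names, and the keyword names `num_inference_steps` and `sampler` are taken from the inference snippets in this README.

```python
from typing import NamedTuple


class SamplingDefaults(NamedTuple):
    """Per-variant sampling settings (hypothetical helper type)."""

    num_inference_steps: int
    sampler: str


# Values copied from the "Available checkpoints" table.
DEFAULTS = {
    "CAFM-JiT-H-16-256": SamplingDefaults(100, "heun"),
    "CAFM-SiT-XL-2-256": SamplingDefaults(250, "heun"),
    "CAFM-Z-Image-T2I": SamplingDefaults(25, "euler"),
}


def sampling_kwargs(variant: str) -> dict:
    """Return the keyword arguments listed in the table for one variant."""
    key = variant.strip("/")  # accept "CAFM-SiT-XL-2-256/" as well
    if key not in DEFAULTS:
        raise KeyError(f"unknown variant {variant!r}; expected one of {sorted(DEFAULTS)}")
    return dict(DEFAULTS[key]._asdict())
```

With a loaded pipeline this could be used as `pipe(class_labels="golden retriever", **sampling_kwargs("CAFM-SiT-XL-2-256"))`.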
## Inference
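The examples in this section load a variant from a local folder with `local_files_only=True`, so that folder has to exist on disk first. One way to fetch a single variant is `snapshot_download` from `huggingface_hub`, restricted to one folder with `allow_patterns`. This is a sketch under the assumption that the variant folders sit at the repository root, as the checkpoint paths in the tables suggest; `variant_patterns` and `download_variant` are hypothetical helper names.

```python
from pathlib import Path

REPO_ID = "BiliSakura/CAFM-diffusers"


def variant_patterns(variant: str) -> list[str]:
    """Glob patterns that limit a download to one variant folder."""
    name = variant.strip("/")
    return [f"{name}/*"]


def download_variant(variant: str, local_dir: str = ".") -> Path:
    """Download one variant folder and return its local path."""
    # Imported lazily so variant_patterns works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    root = snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=variant_patterns(variant),
        local_dir=local_dir,  # files keep their repo-relative paths under this folder
    )
    return Path(root) / variant.strip("/")
```

Calling `download_variant("CAFM-SiT-XL-2-256")` from the working directory should leave the weights in `./CAFM-SiT-XL-2-256`, which is the path the first example expects.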
### ImageNet class-conditional (JiT / SiT)
```python
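from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Path to a local copy of the variant folder.
model_dir = Path("./CAFM-SiT-XL-2-256")

# The pipeline class is loaded from the checkpoint's own pipeline.py,
# which is why trust_remote_code=True is required.
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Steps and solver match the "Available checkpoints" table for this variant.
image = pipe(class_labels="golden retriever", num_inference_steps=250, sampler="heun").images[0]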
```
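Because the snippet above loads with `local_files_only=True`, an incomplete local folder only fails once `from_pretrained` is called. A small pre-flight check can report what is missing earlier. This is a sketch, and the file list is an assumption: `pipeline.py` is the file the snippet points `custom_pipeline` at, and `model_index.json` is the manifest that diffusers pipelines normally ship with. `missing_files` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed file names: pipeline.py is referenced by the loading snippet;
# model_index.json is the usual diffusers pipeline manifest.
REQUIRED_FILES = ("pipeline.py", "model_index.json")


def missing_files(model_dir) -> list[str]:
    """Return the required files absent from model_dir (empty list means ready)."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

For example, `missing_files("./CAFM-SiT-XL-2-256")` returning an empty list suggests the folder is ready to load.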
### Text-to-image (Z-Image)
```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Path to a local copy of the Z-Image variant folder.
model_dir = Path("./CAFM-Z-Image-T2I")

# The pipeline class is loaded from the checkpoint's own pipeline.py,
# which is why trust_remote_code=True is required.
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # recommended for single-GPU inference

# Steps and solver match the "Available checkpoints" table for this variant.
image = pipe(
    prompt="A golden retriever sitting in a sunny park, photo realistic.",
    height=512,
    width=512,
    num_inference_steps=25,
    sampler="euler",
).images[0]
```