---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- cafm
- continuous-adversarial-flow-models
- class-conditional
- imagenet
- text-to-image
- z-image
inference: true
widget:
- output:
    url: CAFM-JiT-H-16-256/demo.png
language:
- en
---

# BiliSakura/CAFM-diffusers

Self-contained [Continuous Adversarial Flow Models](https://arxiv.org/abs/2604.11521) (CAFM) checkpoints for Hugging Face `diffusers`.

Converted from `ByteDance-Seed/Adversarial-Flow-Models` using `libs/AFM-diffusers/scripts/convert_cafm_to_diffusers.py`. The Z-Image weights are bundled under `CAFM-Z-Image-T2I/`, so that folder is self-contained.

## Demo

`CAFM-JiT-H-16-256`, class **207** (*golden retriever*), seed **0**, 100 NFE (Heun):

<p align="center">
  <img src="CAFM-JiT-H-16-256/demo.png" alt="CAFM-JiT-H-16-256 demo (class 207, seed 0)" width="256"/>
</p>

Each variant folder includes a `demo.png` generated with the same prompt settings.

## Benchmark results (ImageNet 256×256)

| Model | Space | NFE | FID | Checkpoint |
| --- | --- | --- | --- | --- |
| CAFM JiT-H/16 | pixel | 100 | 1.80 | `CAFM-JiT-H-16-256/` |
| CAFM SiT-XL/2 | latent | 250 | 1.53 | `CAFM-SiT-XL-2-256/` |
| CAFM Z-Image | latent T2I | 25 | — | `CAFM-Z-Image-T2I/` |

## Available checkpoints

| Variant | Backbone | Steps | Solver |
| --- | --- | ---: | --- |
| `CAFM-JiT-H-16-256/` | JiT | 100 | heun |
| `CAFM-SiT-XL-2-256/` | SiT | 250 | heun |
| `CAFM-Z-Image-T2I/` | Z-Image | 25 | euler |
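
For readers unfamiliar with the Solver column, the sketch below contrasts a first-order Euler step with a second-order Heun step on a toy ODE (`dx/dt = -x`). It is an illustration only: the function names are hypothetical, the toy velocity field stands in for a learned model, and it is not the sampler code shipped in `pipeline.py`.

```python
import math


def euler_step(f, t, x, dt):
    """One Euler step: a single evaluation of the velocity f."""
    return x + dt * f(t, x)


def heun_step(f, t, x, dt):
    """One Heun step: Euler predictor, then average the two slopes."""
    k1 = f(t, x)
    x_pred = x + dt * k1          # Euler prediction at t + dt
    k2 = f(t + dt, x_pred)        # slope at the predicted point
    return x + 0.5 * dt * (k1 + k2)


def integrate(step, f, x0, t0, t1, num_steps):
    """Apply `step` num_steps times on a uniform grid from t0 to t1."""
    dt = (t1 - t0) / num_steps
    t, x = t0, x0
    for _ in range(num_steps):
        x = step(f, t, x, dt)
        t += dt
    return x


def velocity(t, x):
    """Toy velocity field (assumption): exact solution is x0 * exp(-t)."""
    return -x


exact = math.exp(-1.0)
euler_err = abs(integrate(euler_step, velocity, 1.0, 0.0, 1.0, 25) - exact)
heun_err = abs(integrate(heun_step, velocity, 1.0, 0.0, 1.0, 25) - exact)
print(f"euler error: {euler_err:.2e}, heun error: {heun_err:.2e}")
```

With the same number of steps, Heun is more accurate on this toy problem at the cost of a second velocity evaluation per step.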

## Inference

### ImageNet class-conditional (JiT / SiT)

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Local folder holding the checkpoint and its custom pipeline code.
model_dir = Path("./CAFM-SiT-XL-2-256")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

# Steps and solver match the values listed for this variant.
image = pipe(class_labels="golden retriever", num_inference_steps=250, sampler="heun").images[0]
```
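
The inference examples load from a local folder with `local_files_only=True`, so the variant folder has to exist on disk first. One possible way to fetch a single variant is `huggingface_hub.snapshot_download` with an `allow_patterns` filter. This is a sketch under assumptions: the helper names `variant_patterns` and `fetch_variant` are hypothetical, and it assumes the variant folders sit at the root of the `BiliSakura/CAFM-diffusers` repository.

```python
from pathlib import Path

REPO_ID = "BiliSakura/CAFM-diffusers"
# Folder names taken from the checkpoint table in this README.
VARIANTS = ("CAFM-JiT-H-16-256", "CAFM-SiT-XL-2-256", "CAFM-Z-Image-T2I")


def variant_patterns(variant: str) -> list[str]:
    """Build the allow_patterns filter for one variant folder (hypothetical helper)."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return [f"{variant}/*"]


def fetch_variant(variant: str, local_dir: str = ".") -> Path:
    """Download only the files of one variant and return its local folder."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=variant_patterns(variant),
        local_dir=local_dir,
    )
    return Path(local_dir) / variant
```

Calling `fetch_variant("CAFM-SiT-XL-2-256")` from the working directory should leave the files under `./CAFM-SiT-XL-2-256`, the path the example above expects.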

### Text-to-image (Z-Image)

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Local folder holding the bundled Z-Image weights and custom pipeline code.
model_dir = Path("./CAFM-Z-Image-T2I")
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # recommended for single-GPU inference

image = pipe(
    prompt="A golden retriever sitting in a sunny park, photo realistic.",
    height=512,
    width=512,
    num_inference_steps=25,
    sampler="euler",
).images[0]
```
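
Because each variant is listed with its own step count and solver, it can be convenient to keep those values in one place. The sketch below is a hypothetical helper (`sampling_kwargs` is not part of this repository) that returns the `num_inference_steps` and `sampler` values from the checkpoint table for a given folder name or path.

```python
from pathlib import Path

# Step counts and solvers copied from the "Available checkpoints" table.
RECOMMENDED = {
    "CAFM-JiT-H-16-256": {"num_inference_steps": 100, "sampler": "heun"},
    "CAFM-SiT-XL-2-256": {"num_inference_steps": 250, "sampler": "heun"},
    "CAFM-Z-Image-T2I": {"num_inference_steps": 25, "sampler": "euler"},
}


def sampling_kwargs(variant: str) -> dict:
    """Return a copy of the listed settings for a variant name or folder path."""
    name = Path(variant).name  # "./CAFM-SiT-XL-2-256/" -> "CAFM-SiT-XL-2-256"
    if name not in RECOMMENDED:
        raise ValueError(f"unknown variant {name!r}; expected one of {sorted(RECOMMENDED)}")
    return dict(RECOMMENDED[name])


print(sampling_kwargs("./CAFM-Z-Image-T2I"))
```

The returned dictionary can be unpacked into the pipeline call, for example `pipe(prompt=..., **sampling_kwargs("./CAFM-Z-Image-T2I"))`.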