---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
tags:
- diffusers
- cafm
- continuous-adversarial-flow-models
- class-conditional
- imagenet
- text-to-image
- z-image
inference: true
widget:
- output:
    url: CAFM-JiT-H-16-256/demo.png
language:
- en
---

# BiliSakura/CAFM-diffusers

Self-contained [Continuous Adversarial Flow Models](https://arxiv.org/abs/2604.11521) checkpoints for Hugging Face diffusers. Converted from `ByteDance-Seed/Adversarial-Flow-Models` using `libs/AFM-diffusers/scripts/convert_cafm_to_diffusers.py`. Z-Image weights are bundled self-contained under `CAFM-Z-Image-T2I/`.

## Demo

`CAFM-JiT-H-16-256`, class **207** (*golden retriever*), seed **0**, 100 NFE (Heun):

![CAFM-JiT-H-16-256 demo (class 207, seed 0)](CAFM-JiT-H-16-256/demo.png)

Each variant folder includes `demo.png` generated with the same prompt settings.

## Benchmark results (ImageNet 256×256)

| Model | Space | NFE | FID | Checkpoint |
| --- | --- | --- | --- | --- |
| CAFM JiT-H/16 | pixel | 100 | 1.80 | `CAFM-JiT-H-16-256/` |
| CAFM SiT-XL/2 | latent | 250 | 1.53 | `CAFM-SiT-XL-2-256/` |
| CAFM Z-Image | latent T2I | 25 | n/a | `CAFM-Z-Image-T2I/` |

## Available checkpoints

| Variant | Backbone | Steps | Solver |
| --- | --- | ---: | --- |
| `CAFM-JiT-H-16-256/` | JIT | 100 | heun |
| `CAFM-SiT-XL-2-256/` | SIT | 250 | heun |
| `CAFM-Z-Image-T2I/` | Z-IMAGE | 25 | euler |

## Inference

### ImageNet class-conditional (JiT / SiT)

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./CAFM-SiT-XL-2-256")

# Load the checkpoint together with the custom pipeline shipped in the same folder.
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(class_labels="golden retriever", num_inference_steps=250, sampler="heun").images[0]
```

### Text-to-image (Z-Image)

```python
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

model_dir = Path("./CAFM-Z-Image-T2I")

# Load the checkpoint together with the custom pipeline shipped in the same folder.
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # recommended for single-GPU inference

image = pipe(
    prompt="A golden retriever sitting in a sunny park, photo realistic.",
    height=512,
    width=512,
    num_inference_steps=25,
    sampler="euler",
).images[0]
```
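
### Per-variant sampling defaults (sketch)

The step counts and solvers listed under "Available checkpoints" can be kept in a small lookup so that each variant is always sampled with its listed settings. The following is a minimal sketch, not part of the shipped pipelines: `VariantDefaults`, `DEFAULTS`, and `sampling_kwargs` are hypothetical names introduced here for illustration, and the only assumption about the pipeline call is that it accepts `num_inference_steps` and `sampler`, as in the examples above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class VariantDefaults:
    """Settings copied from the "Available checkpoints" table (hypothetical helper)."""
    task: str      # "class-conditional" or "text-to-image"
    steps: int     # value of the "Steps" column
    sampler: str   # value of the "Solver" column


# Values transcribed from the table; nothing here is read from the checkpoints.
DEFAULTS = {
    "CAFM-JiT-H-16-256": VariantDefaults("class-conditional", 100, "heun"),
    "CAFM-SiT-XL-2-256": VariantDefaults("class-conditional", 250, "heun"),
    "CAFM-Z-Image-T2I": VariantDefaults("text-to-image", 25, "euler"),
}


def sampling_kwargs(variant) -> dict:
    """Return the keyword arguments for a variant folder name or path."""
    name = Path(variant).name  # tolerates "./name" and "name/"
    if name not in DEFAULTS:
        raise ValueError(f"unknown variant: {variant!r}")
    d = DEFAULTS[name]
    return {"num_inference_steps": d.steps, "sampler": d.sampler}


if __name__ == "__main__":
    for folder in DEFAULTS:
        print(folder, sampling_kwargs(folder))
```

With a loaded pipeline, the result can be splatted into the call, for example `pipe(class_labels="golden retriever", **sampling_kwargs(model_dir))`.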