| license: apache-2.0 | |
| language: | |
| - en | |
| library_name: diffusers | |
| tags: | |
| - diffusers | |
| - image-generation | |
| - class-conditional | |
| - nit | |
| pipeline_tag: unconditional-image-generation | |
| widget: | |
| - output: | |
| url: demo.png | |
| # NiT-diffusers | |
| Native diffusers implementation of **NiT** (Native-resolution Image Transformer). Each variant folder is self-contained: | |
| - `pipeline.py`: `NiTPipeline` | |
| - `scheduler/scheduler_config.json`: `FlowMatchEulerDiscreteScheduler` config (the class ships with Diffusers) | |
| - `transformer/nit_transformer_2d.py`: `NiTTransformer2DModel` | |
| - `vae/`: `AutoencoderDC` weights + config | |
| No separate `NiT-diffusers` package is needed at inference time: only `diffusers` from PyPI plus the custom code bundled in the variant directory. | |
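Because each variant folder has to be complete on its own, a quick check before loading can catch a partial download or a path that points at the repo root. The sketch below only tests for the entries listed above; `EXPECTED_ENTRIES` and `missing_variant_entries` are hypothetical names used for illustration and are not part of the repository.

```python
from pathlib import Path

# Entries taken from the list above; a trailing "/" marks a directory.
EXPECTED_ENTRIES = (
    "pipeline.py",
    "scheduler/scheduler_config.json",
    "transformer/nit_transformer_2d.py",
    "vae/",
)


def missing_variant_entries(variant_dir):
    """Return the expected entries that are absent from a variant folder."""
    root = Path(variant_dir)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry
        # Directories and files are checked differently.
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing
```

For example, `missing_variant_entries("./NiT-XL")` returns an empty list when all four entries are present.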
| ## Available checkpoints | |
| | Checkpoint | Path | Resolution | Recommended settings | | |
| | --- | --- | --- | --- | | |
| | NiT-S | `./NiT-S` | 256×256 | 250 steps, CFG 2.25, interval (0.0, 0.7) | |
| | NiT-B | `./NiT-B` | 256×256 | 250 steps, CFG 2.25, interval (0.0, 0.7) | |
| | NiT-L | `./NiT-L` | 512×512 | 250 steps, CFG 2.05, interval (0.0, 0.7) | |
| | NiT-XL | `./NiT-XL` | 512×512 | 250 steps, CFG 2.05, interval (0.0, 0.7) | |
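The recommended settings from the table can be kept in a small lookup so that pipeline calls stay consistent across variants. This is only an illustrative sketch: `RECOMMENDED_SETTINGS`, `COMMON_SETTINGS`, and `settings_for` are hypothetical names, and the keyword names mirror the ones used in the inference examples in this card.

```python
# Hypothetical helper: values copied from the checkpoint table above.
RECOMMENDED_SETTINGS = {
    "NiT-S": {"height": 256, "width": 256, "guidance_scale": 2.25},
    "NiT-B": {"height": 256, "width": 256, "guidance_scale": 2.25},
    "NiT-L": {"height": 512, "width": 512, "guidance_scale": 2.05},
    "NiT-XL": {"height": 512, "width": 512, "guidance_scale": 2.05},
}

# Settings shared by every row of the table.
COMMON_SETTINGS = {"num_inference_steps": 250, "guidance_interval": (0.0, 0.7)}


def settings_for(variant):
    """Return pipeline keyword arguments for a variant name such as "NiT-XL"."""
    if variant not in RECOMMENDED_SETTINGS:
        known = ", ".join(sorted(RECOMMENDED_SETTINGS))
        raise KeyError(f"unknown variant {variant!r}; expected one of: {known}")
    return {**COMMON_SETTINGS, **RECOMMENDED_SETTINGS[variant]}
```

With a loaded pipeline, a call would then look like `pipe(class_labels="golden retriever", **settings_for("NiT-XL"))`.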
| ## ImageNet class labels | |
| Each variant keeps an English `id2label` map directly in its own `model_index.json` (DiT-style). | |
| - `pipe.id2label`: inspect the id → English label correspondence | |
| - `pipe.labels`: reverse map (English synonym → id), sorted for browsing | |
| - `pipe.get_label_ids("golden retriever")` | |
| - `pipe(class_labels="golden retriever", ...)`: string labels are resolved automatically | |
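To make the relationship between the id map and the reverse map concrete, here is a minimal sketch modeled on how the DiT pipeline in Diffusers derives its label lookup, where one `id2label` entry may hold several comma-separated synonyms. It is an assumption that `NiTPipeline` does the same; `build_label_map`, `lookup_label_ids`, and the two-entry `sample_id2label` are hypothetical and only illustrate the idea.

```python
def build_label_map(id2label):
    """Invert an id -> "synonym, synonym, ..." map into a sorted synonym -> id map."""
    labels = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            labels[name.strip()] = int(class_id)
    return dict(sorted(labels.items()))


def lookup_label_ids(labels, label):
    """Resolve one label string or a list of label strings to class ids."""
    if isinstance(label, str):
        label = [label]
    unknown = [name for name in label if name not in labels]
    if unknown:
        raise ValueError(f"unknown label(s): {unknown}")
    return [labels[name] for name in label]


# Two sample ImageNet entries, not the full 1000-class map.
sample_id2label = {"1": "goldfish, Carassius auratus", "207": "golden retriever"}
sample_labels = build_label_map(sample_id2label)
print(lookup_label_ids(sample_labels, "golden retriever"))  # [207]
```

Raising on unknown labels, rather than silently falling back to some class, keeps typos from producing an unrelated image.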
| ## Inference | |
| Run the bundled demo script from the repo root: | |
| ```bash | |
| python demo_inference.py | |
| ``` | |
| This writes `demo.png` using `NiT-XL` with the settings below. | |
| ```python | |
| from pathlib import Path | |
| import torch | |
| from diffusers import DiffusionPipeline | |
| # Point at a variant subfolder, not the repo root. | |
| model_dir = Path("./NiT-XL").resolve() | |
| pipe = DiffusionPipeline.from_pretrained( | |
|     str(model_dir), | |
|     local_files_only=True, | |
|     custom_pipeline=str(model_dir / "pipeline.py"), | |
|     trust_remote_code=True,  # the variant folder ships its own pipeline code | |
|     torch_dtype=torch.bfloat16, | |
| ) | |
| pipe.to("cuda") | |
| print(pipe.id2label[207])  # English label stored for class id 207 | |
| print(pipe.get_label_ids("golden retriever")) | |
| # Fixed seed so the demo image is reproducible. | |
| generator = torch.Generator(device="cuda").manual_seed(42) | |
| image = pipe( | |
|     class_labels="golden retriever", | |
|     height=512, | |
|     width=512, | |
|     num_inference_steps=250, | |
|     guidance_scale=2.05, | |
|     guidance_interval=(0.0, 0.7), | |
|     generator=generator, | |
| ).images[0] | |
| image.save("demo.png") | |
| ``` | |
| Load a **variant subfolder** (e.g. `./NiT-XL`, `./NiT-L`, `./NiT-B`, or `./NiT-S`), not the repo root. | |
| For NiT-S / NiT-B at 256×256 (official defaults): | |
| ```python | |
| from pathlib import Path | |
| import torch | |
| from diffusers import DiffusionPipeline | |
| model_dir = Path("./NiT-S").resolve()  # or ./NiT-B | |
| pipe = DiffusionPipeline.from_pretrained( | |
|     str(model_dir), | |
|     local_files_only=True, | |
|     custom_pipeline=str(model_dir / "pipeline.py"), | |
|     trust_remote_code=True, | |
|     torch_dtype=torch.bfloat16, | |
| ) | |
| pipe.to("cuda") | |
| image = pipe( | |
|     class_labels="golden retriever", | |
|     height=256, | |
|     width=256, | |
|     num_inference_steps=250, | |
|     guidance_scale=2.25, | |
|     guidance_interval=(0.0, 0.7), | |
|     generator=torch.Generator(device="cuda").manual_seed(42), | |
| ).images[0] | |
| ``` | |
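The examples above pass `guidance_interval=(0.0, 0.7)` without spelling out what it does. One common reading, borrowed from interval-based classifier-free guidance, is that the guided combination is applied only while the sampling time lies inside the interval, and the plain conditional prediction is used elsewhere. The sketch below illustrates that reading on plain Python lists; it is an assumption about the semantics, not a description of `NiTPipeline` internals, and `guided_prediction` and the convention for `t` are hypothetical.

```python
def guided_prediction(cond, uncond, t, guidance_scale, guidance_interval):
    """Combine conditional and unconditional predictions with interval-limited CFG.

    Assumes t is a normalized time in [0, 1]; which end corresponds to pure
    noise is not specified here and depends on the sampler.
    """
    low, high = guidance_interval
    if low <= t <= high:
        # Standard classifier-free guidance: uncond + scale * (cond - uncond).
        return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
    # Outside the interval, fall back to the conditional prediction alone.
    return list(cond)
```

Under this reading, an interval of `(0.0, 1.0)` would reduce to ordinary classifier-free guidance at every step.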
| To load directly from the Hub, pass the Hugging Face model id (`UserID/RepoID`) and select the variant with `subfolder`: | |
| ```python | |
| from diffusers import DiffusionPipeline | |
| import torch | |
| pipe = DiffusionPipeline.from_pretrained( | |
|     "BiliSakura/NiT-diffusers", | |
|     subfolder="NiT-XL", | |
|     custom_pipeline="pipeline.py", | |
|     trust_remote_code=True, | |
|     torch_dtype=torch.bfloat16, | |
| ) | |
| pipe.to("cuda") | |
| ``` | |
| ## Citation | |
| ```bibtex | |
| @article{wang2025native, | |
| title={Native-Resolution Image Synthesis}, | |
| author={Wang, Zidong and Bai, Lei and Yue, Xiangyu and Ouyang, Wanli and Zhang, Yiyuan}, | |
| year={2025}, | |
| eprint={2506.03131}, | |
| archivePrefix={arXiv}, | |
| primaryClass={cs.CV} | |
| } | |
| ``` | |