# Simple Latent Diffusion Model (LDM)
This repository contains the pre-trained weights and configuration files for the Simple Latent Diffusion Model project.
For the full source code, detailed explanations, and implementation logic, please visit the original GitHub repository.
## Model Description
This project implements a Latent Diffusion Model (LDM) from scratch. The repository includes:
- Custom-trained VAE: For compressing images into a latent space.
- Diffusion Model: A U-Net based architecture for the reverse diffusion process.
- CLIP Weights: Integrated for text-guided image generation.
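
To illustrate how these components fit together at sampling time, here is a minimal sketch of DDPM-style reverse diffusion in latent space followed by VAE decoding. This is not the repository's actual API; the `unet`, `vae_decoder`, and text-embedding arguments and their signatures are assumptions for illustration only.

```python
# Hypothetical sketch of latent-space DDPM sampling; the real repository's
# class names and call signatures may differ.
import torch


@torch.no_grad()
def sample_latent_ddpm(unet, vae_decoder, betas, latent_shape, text_emb=None, device="cpu"):
    """Run reverse diffusion in latent space, then decode latents to images.

    Assumptions: unet(x_t, t, text_emb) predicts the noise added at step t,
    and vae_decoder(z) maps latents back to pixel space.
    """
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(latent_shape, device=device)  # start from pure Gaussian noise
    for t in reversed(range(len(betas))):
        t_batch = torch.full((latent_shape[0],), t, device=device, dtype=torch.long)
        eps = unet(x, t_batch, text_emb)  # predicted noise at step t

        # Standard DDPM posterior mean for x_{t-1}
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])

        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise

    return vae_decoder(x)  # map denoised latents back to images
```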
## Available Models & Checkpoints
The repository provides weights for three different datasets, covering both unconditional and conditional generation tasks:
| Dataset | Type | Description |
|---|---|---|
| CIFAR-10 | Unconditional | 32x32 image generation trained on the CIFAR-10 dataset. |
| CelebA | Unconditional | Human face generation trained on the CelebA dataset. |
| Asian Composite | Text-to-Image (T2I) | CLIP-based conditional generation using the Asian Composite Dataset. |
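
As a rough sketch of how a checkpoint from this repository might be fetched and loaded, the snippet below uses `huggingface_hub` and `torch.load`. The repo id, file name, and model class are placeholders, not confirmed names; the actual classes and checkpoint layout are defined in the GitHub repository.

```python
# Minimal sketch of loading a checkpoint; repo id, file name, and model
# class below are assumptions, not the repository's confirmed API.
import torch
from huggingface_hub import hf_hub_download

# Download a weight file from the Hub (placeholder repo id and file name).
ckpt_path = hf_hub_download(
    repo_id="Won-Seong/simple-latent-diffusion-model",
    filename="cifar10_unconditional.pt",
)

# Load the state dict, then restore it into whichever model class the
# GitHub repository defines (shown here as commented-out placeholders).
state_dict = torch.load(ckpt_path, map_location="cpu")
# model = DiffusionModel(**config)   # hypothetical class and config
# model.load_state_dict(state_dict)
# model.eval()
```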
## How to Use
If you want to experiment with these models and generate your own images, we provide a hands-on example notebook.
- Open the `cifar10_example.ipynb` file provided in this repository.
- Follow the step-by-step instructions to load the configurations and model weights.
- Run the cells to start the sampling process and generate images.
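
The notebook's sampling step produces a batch of image tensors. A minimal way to inspect such a batch is shown below, using `torchvision`; the random tensor is only a stand-in for the model's actual output.

```python
import torch
from torchvision.utils import save_image

# Stand-in for the notebook's output: a batch of 16 generated 32x32 RGB
# images. In practice this tensor comes from the model's sampling routine.
images = torch.rand(16, 3, 32, 32)

# Arrange the batch as a 4x4 grid and write it to disk for inspection.
save_image(images, "samples.png", nrow=4)
```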
## References
- GitHub Repository: Won-Seong/simple-latent-diffusion-model
- Contact: For detailed code logic or issues, please refer to the GitHub documentation or open an issue there.
Note: Ensure you have the necessary dependencies installed as specified in the GitHub repository's requirements.