# Simple Latent Diffusion Model (LDM)
This repository contains the pre-trained weights and configuration files for the Simple Latent Diffusion Model project.
For the full source code, detailed explanations, and implementation logic, please visit the original GitHub repository.
## Model Description
This project implements a Latent Diffusion Model (LDM) from scratch. The repository includes:
- Custom-trained VAE: For compressing images into a latent space.
- Diffusion Model: A U-Net based architecture for the reverse diffusion process.
- CLIP Weights: Integrated for text-guided image generation.
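
To illustrate how these components fit together at sampling time, here is a minimal sketch of DDPM-style reverse diffusion in latent space followed by VAE decoding. This is not the repository's actual API; the `unet`, `vae_decoder`, and text-embedding arguments and their signatures are assumptions for illustration only.

```python
# Hypothetical sketch of latent-space DDPM sampling; the real repository's
# class names and call signatures may differ.
import torch


@torch.no_grad()
def sample_latent_ddpm(unet, vae_decoder, betas, latent_shape, text_emb=None, device="cpu"):
    """Run reverse diffusion in latent space, then decode latents to images.

    Assumptions: unet(x_t, t, text_emb) predicts the noise added at step t,
    and vae_decoder(z) maps latents back to pixel space.
    """
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(latent_shape, device=device)  # start from pure Gaussian noise
    for t in reversed(range(len(betas))):
        t_batch = torch.full((latent_shape[0],), t, device=device, dtype=torch.long)
        eps = unet(x, t_batch, text_emb)  # predicted noise at step t

        # Standard DDPM posterior mean for x_{t-1}
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])

        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise

    return vae_decoder(x)  # map denoised latents back to images
```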
## Available Models & Checkpoints
The repository provides weights for three different datasets, covering both unconditional and conditional generation tasks:
| Dataset | Type | Description |
|---|---|---|
| CIFAR-10 | Unconditional | 32x32 image generation trained on the CIFAR-10 dataset. |
| CelebA | Unconditional | Human face generation trained on the CelebA dataset. |
| Asian Composite | Text-to-Image (T2I) | CLIP-based conditional generation using the Asian Composite Dataset. |
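
As a rough sketch of how a checkpoint from this repository might be fetched and loaded, the snippet below uses `huggingface_hub` and `torch.load`. The repo id, file name, and model class are placeholders, not confirmed names; the actual classes and checkpoint layout are defined in the GitHub repository.

```python
# Minimal sketch of loading a checkpoint; repo id, file name, and model
# class below are assumptions, not the repository's confirmed API.
import torch
from huggingface_hub import hf_hub_download

# Download a weight file from the Hub (placeholder repo id and file name).
ckpt_path = hf_hub_download(
    repo_id="Won-Seong/simple-latent-diffusion-model",
    filename="cifar10_unconditional.pt",
)

# Load the state dict, then restore it into whichever model class the
# GitHub repository defines (shown here as commented-out placeholders).
state_dict = torch.load(ckpt_path, map_location="cpu")
# model = DiffusionModel(**config)   # hypothetical class and config
# model.load_state_dict(state_dict)
# model.eval()
```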
## How to Use
If you want to experiment with these models and generate your own images, we provide a hands-on example notebook.
- Open the `cifar10_example.ipynb` file provided in this repository.
- Follow the step-by-step instructions to load the configurations and model weights.
- Run the cells to start the sampling process and generate images.
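
The notebook's sampling step produces a batch of image tensors. A minimal way to inspect such a batch is shown below, using `torchvision`; the random tensor is only a stand-in for the model's actual output.

```python
import torch
from torchvision.utils import save_image

# Stand-in for the notebook's output: a batch of 16 generated 32x32 RGB
# images. In practice this tensor comes from the model's sampling routine.
images = torch.rand(16, 3, 32, 32)

# Arrange the batch as a 4x4 grid and write it to disk for inspection.
save_image(images, "samples.png", nrow=4)
```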
## References
- GitHub Repository: Won-Seong/simple-latent-diffusion-model
- Contact: For detailed code logic or issues, please refer to the GitHub documentation or open an issue there.
Note: Ensure you have the necessary dependencies installed as specified in the GitHub repository's requirements.