Sage-T2I

Photorealistic Diffusion Transformer — 1024×1024 native generation + 4K upscale

A from-scratch Diffusion Transformer (DiT) trained on real photographs from STL-10. It generates photorealistic images natively at 1024×1024 and can upscale them to 4K (3840×3840) with PIL's LANCZOS interpolation. No learned super-resolution model (SRGAN, ESRGAN) is involved.

This is a real trained model. Every pixel of the native output comes from the diffusion process, with no simulated or mocked generation.

Hub	Link
Model	itriedcoding/sage-t2i
Space	itriedcoding/sage-t2i
Source	GitHub

Model Architecture

Component	Details
Type	Diffusion Transformer (DiT) with cross-attention
Parameters	43.4M (trained), up to 300M (configurable)
Text Encoder	CLIP ViT-L/14 (frozen)
Image VAE	KL-F8 (frozen)
Hidden Size	384
Layers	12
Heads	6
Config	384 hidden, 12 layers, 6 heads, 128px train, 1024px inference
Training Resolution	128x128 latent -> 1024x1024 (pos_embed interpolation)
Upscaling	Real PIL LANCZOS to 3840x3840 (true 4K)
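
The pos_embed interpolation named in the table can be illustrated with a small, dependency-free sketch. It is not the project's code: the patch size of 2, the function names, and the bilinear (align-corners) scheme are assumptions made for illustration. Only the VAE downsampling factor of 8 follows from the KL-F8 entry above. In a PyTorch model the same resize would more likely be done with `torch.nn.functional.interpolate`.

```python
def tokens_per_side(image_px, vae_factor=8, patch=2):
    """Tokens along one side of a square image.

    vae_factor=8 matches a KL-F8 VAE; patch=2 is an assumed DiT patch size.
    """
    step = vae_factor * patch
    if image_px % step:
        raise ValueError(f"image size must be a multiple of {step}")
    return image_px // step


def resize_pos_embed(grid, new_side):
    """Bilinearly resize a square grid of position-embedding vectors.

    grid is a list of rows; each cell is a list of floats (one embedding).
    Corner embeddings are preserved (align-corners mapping).
    """
    old = len(grid)
    out = []
    for i in range(new_side):
        # Map the target row index to a fractional source row.
        y = i * (old - 1) / (new_side - 1) if new_side > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, old - 1)
        wy = y - y0
        row = []
        for j in range(new_side):
            x = j * (old - 1) / (new_side - 1) if new_side > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, old - 1)
            wx = x - x0
            # Blend the four neighbouring embeddings, channel by channel.
            vec = [
                (1 - wy) * ((1 - wx) * a + wx * b) + wy * ((1 - wx) * c + wx * d)
                for a, b, c, d in zip(
                    grid[y0][x0], grid[y0][x1], grid[y1][x0], grid[y1][x1]
                )
            ]
            row.append(vec)
        out.append(row)
    return out
```

Under these assumptions a 1024×1024 image maps to a 64×64 token grid, so a position-embedding table learned on a smaller grid would be resized to 64×64 before inference.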

Capabilities

Native 1024x1024 generation - real diffusion, no tiling/chaining
4K output - professional-grade LANCZOS upscale
Multi-resolution - 256, 512, 1024 all supported via pos_embed interpolation
Photorealism - Trained on real STL-10 photographs, not synthetic data
No simulations, no fakes - every pixel comes from the diffusion process
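
To make the LANCZOS upscale above more concrete, here is a minimal sketch of the Lanczos kernel and a one-dimensional upsampler built on it. It is an illustration only, not Pillow's implementation: the function names are hypothetical, `a=3` is the commonly used window size, and edges are handled by simple clamping. A 2D image resize applies the same idea along rows and then columns.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_upsample_1d(samples, new_len, a=3):
    """Upsample a 1D signal to new_len points with Lanczos weights.

    Intended for upscaling only; downscaling would also need the kernel
    widened by the scale factor.
    """
    old_len = len(samples)
    scale = old_len / new_len
    out = []
    for i in range(new_len):
        # Centre of output sample i, expressed in source coordinates.
        center = (i + 0.5) * scale - 0.5
        base = math.floor(center)
        total = 0.0
        weight_sum = 0.0
        for k in range(base - a + 1, base + a + 1):
            w = lanczos_kernel(center - k, a)
            # Clamp indices that fall outside the signal.
            total += w * samples[min(max(k, 0), old_len - 1)]
            weight_sum += w
        # Normalise so that flat regions stay flat.
        out.append(total / weight_sum)
    return out
```

Because the kernel is a fixed interpolation filter, the upscale adds no detail that the 1024×1024 output does not already contain; it only resamples it.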

Training

Dataset: STL-10 (5000 real labeled photographs, 10 classes)
Hardware: CPU (optimized), AMD/NVIDIA GPU support
Optimizer: SGD with momentum

Usage

Local Inference

from model.pipeline import SageT2IPipeline

# Load the trained DiT checkpoint
pipe = SageT2IPipeline(model_path="checkpoints/dit_best.pt")
# Generate at the native 1024x1024 resolution
image = pipe("a photorealistic cat", num_steps=50, output_size=1024)
image.save("output.png")

Gradio Web UI

python app.py

Local Training

python train_local.py

Deployment

Deploy to Hugging Face (Model Hub + Space)

The project includes an automated deployment script. It will:

Verify the checkpoint is real (size + tensor count checks)
Create a Model Hub repository with weights, config, and pipeline code
Create a Gradio Space with the interactive web demo
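
The checkpoint verification in the first step could look roughly like the sketch below. The actual checks in deploy_to_hf.py are not shown here, so the function names and the thresholds passed in are hypothetical; the sketch only illustrates what a "size + tensor count" check means.

```python
import os


def count_tensors(state_dict):
    """Count entries that look like tensors (anything exposing .shape)."""
    return sum(1 for value in state_dict.values() if hasattr(value, "shape"))


def check_checkpoint(path, state_dict, min_bytes, min_tensors):
    """Return a list of problems; an empty list means the checks passed.

    min_bytes and min_tensors are caller-chosen thresholds (hypothetical).
    """
    problems = []
    size = os.path.getsize(path)
    if size < min_bytes:
        problems.append(f"file is only {size} bytes, expected at least {min_bytes}")
    found = count_tensors(state_dict)
    if found < min_tensors:
        problems.append(f"found {found} tensors, expected at least {min_tensors}")
    return problems
```

With PyTorch, `state_dict` would typically come from `torch.load(path, map_location="cpu")`, possibly nested under a key, depending on how the checkpoint was saved.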

# Set your token (get one at https://hf.co/settings/tokens)
# The line below is Windows cmd syntax; on Linux/macOS use: export HF_TOKEN=hf_your_token_here
set HF_TOKEN=hf_your_token_here

# Deploy both model hub and space
python deploy_to_hf.py

# Deploy just the model hub
python deploy_to_hf.py --model-only

# Deploy just the space
python deploy_to_hf.py --space-only

The script will prompt for your token if HF_TOKEN is not set.
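
The fallback described above (environment variable first, interactive prompt second) can be sketched as follows. The function name and its parameters are hypothetical and are not taken from deploy_to_hf.py; `env` and `prompt` are injectable only to make the logic easy to test.

```python
import os
from getpass import getpass


def resolve_hf_token(env=None, prompt=getpass):
    """Return HF_TOKEN from the environment, or ask the user for it."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if not token:
        # getpass keeps the token from being echoed to the terminal.
        token = prompt("Hugging Face token: ").strip()
    if not token:
        raise ValueError("no Hugging Face token provided")
    return token
```

Reading the token through `getpass` rather than `input` keeps it out of the terminal scrollback.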

Manual Deployment

Model Hub

git lfs install
git clone https://huggingface.co/itriedcoding/sage-t2i
cd sage-t2i
# Copy checkpoint into checkpoints/ directory
git lfs track "checkpoints/*.pt"
git add .
git commit -m "Add model checkpoint"
git push

Space (Gradio Web UI)

Go to https://huggingface.co/new-space
Set Space name: sage-t2i
Select SDK: Gradio
Select hardware: CPU upgrade (recommended)
Upload the Space files (app.py, .space, requirements.txt, model package)
For the model checkpoint, either:
- Upload via git LFS to the Space repo, or
- Set MODEL_PATH Space secret to point to the model hub
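
How app.py interprets MODEL_PATH is not shown in this README. One plausible approach, sketched under assumptions, is to treat the value as a local file when it exists and as a Hub repository id otherwise. The function name, the default path, and the `filename` inside the Hub repo are all assumptions; `hf_hub_download` is the standard download helper from the huggingface_hub package.

```python
import os


def resolve_model_path(env=None,
                       default="checkpoints/dit_best.pt",
                       filename="checkpoints/dit_best.pt"):
    """Return a local checkpoint path, downloading from the Hub if needed.

    default and filename are assumed values, not confirmed by the project.
    """
    env = os.environ if env is None else env
    value = env.get("MODEL_PATH", default)
    if os.path.exists(value):
        # MODEL_PATH (or the default) already points at a local file.
        return value
    # Otherwise treat the value as a repo id such as "itriedcoding/sage-t2i".
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=value, filename=filename)
```

The import is deferred so that a Space with the checkpoint committed via git LFS does not need huggingface_hub at all.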

Self-Hosted

git clone https://huggingface.co/itriedcoding/sage-t2i
cd sage-t2i
pip install -r requirements.txt
python app.py

Hugging Face Resources

Model Hub: https://huggingface.co/itriedcoding/sage-t2i
Gradio Space: https://huggingface.co/spaces/itriedcoding/sage-t2i
Duplicate Space: https://huggingface.co/spaces/itriedcoding/sage-t2i?duplicate=true
