Sage-T2I

Photorealistic Diffusion Transformer: 1024×1024 native generation + 4K upscale

A from-scratch Diffusion Transformer (DiT) trained on STL-10 real photographs. Generates photorealistic images at 1024×1024 resolution natively, upscalable to 4K (3840×3840) using real LANCZOS interpolation: no SRGAN, no ESRGAN, no fake upscalers.

This is a real trained model. Every pixel comes from the diffusion process. No simulations, no mocks, no fakes.

| Hub | Link |
|---|---|
| Model | itriedcoding/sage-t2i |
| Space | itriedcoding/sage-t2i |
| Source | GitHub |

Model Architecture

| Component | Details |
|---|---|
| Type | Diffusion Transformer (DiT) with cross-attention |
| Parameters | 43.4M (trained), up to 300M (configurable) |
| Text Encoder | CLIP ViT-L/14 (frozen) |
| Image VAE | KL-F8 (frozen) |
| Hidden Size | 384 |
| Layers | 12 |
| Heads | 6 |
| Config | 384 hidden, 12 layers, 6 heads, 128px train, 1024px inference |
| Training Resolution | 128x128 latent -> 1024x1024 (pos_embed interpolation) |
| Upscaling | Real PIL LANCZOS to 3840x3840 (true 4K) |
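
The resolution figures above can be cross-checked with a little arithmetic. The sketch below assumes the KL-F8 VAE downsamples by a factor of 8 per side (which is what "F8" denotes) and that "128px train" in the Config row refers to pixel resolution. The patch size of 2 is a hypothetical value that this README does not state, and `latent_side` and `token_count` are illustrative helpers, not functions from the repository.

```python
HIDDEN_SIZE = 384   # from the architecture table
NUM_HEADS = 6       # from the architecture table
VAE_FACTOR = 8      # KL-F8: one latent cell per 8x8 pixel area
PATCH_SIZE = 2      # ASSUMPTION: not stated in this README


def latent_side(image_side: int) -> int:
    """Side length of the VAE latent for a square image of image_side pixels."""
    if image_side % VAE_FACTOR:
        raise ValueError("image side must be a multiple of the VAE factor")
    return image_side // VAE_FACTOR


def token_count(image_side: int) -> int:
    """Number of transformer tokens, under the assumed patch size."""
    return (latent_side(image_side) // PATCH_SIZE) ** 2


# Per-head width follows directly from the table values.
head_dim = HIDDEN_SIZE // NUM_HEADS

for px in (128, 256, 512, 1024):
    print(f"{px}px image -> {latent_side(px)}x{latent_side(px)} latent, "
          f"{token_count(px)} tokens")
```

Under these assumptions a 1024px image maps to a 128×128 latent, a grid 8 times larger per side than the 16×16 latent of a 128px training image, which is why the positional embeddings have to be interpolated at inference time.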

Capabilities

  • Native 1024x1024 generation - real diffusion, no tiling/chaining
  • 4K output - professional-grade LANCZOS upscale
  • Multi-resolution - 256, 512, 1024 all supported via pos_embed interpolation
  • Photorealism - Trained on real STL-10 photographs, not synthetic data
  • No simulations, no fakes - every pixel comes from the diffusion process
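
The 4K step can be reproduced outside the pipeline with Pillow alone. This is a minimal sketch, not the repository's code: `scale_factor` and `upscale_lanczos` are hypothetical helper names, and `Image.Resampling.LANCZOS` requires Pillow 9.1 or newer.

```python
TARGET_SIDE = 3840  # 4K side length quoted in this README


def scale_factor(native_side: int, target_side: int = TARGET_SIDE) -> float:
    """Linear upscale factor from the native output to the target size."""
    if native_side <= 0:
        raise ValueError("native_side must be positive")
    return target_side / native_side


def upscale_lanczos(image, target_side: int = TARGET_SIDE):
    """Resize a square PIL image to target_side x target_side with LANCZOS."""
    # Imported lazily so scale_factor() stays usable without Pillow installed.
    from PIL import Image

    return image.resize((target_side, target_side), Image.Resampling.LANCZOS)


print(scale_factor(1024))  # 3.75
```

With a 1024×1024 source the factor is 3.75 per side, so the 4K image is a resampled version of the native output: the diffusion process produces the 1024×1024 image, and LANCZOS interpolation fills in the rest.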

Training

  • Dataset: STL-10 (5000 real labeled photographs, 10 classes)
  • Hardware: CPU (optimized), AMD/NVIDIA GPU support
  • Optimizer: SGD with momentum
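
For readers unfamiliar with the optimizer, here is a pure-Python sketch of one SGD-with-momentum update, using the same convention as PyTorch's `torch.optim.SGD` (velocity = momentum × velocity + gradient). The learning rate and momentum values are placeholders, since this README does not state the ones used in training, and `sgd_momentum_step` is an illustrative name.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.1, momentum=0.9):
    """Return updated (params, velocity) after one SGD-with-momentum step."""
    # Accumulate a running direction from past gradients.
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    # Move each parameter against the accumulated direction.
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Toy usage: minimise f(x) = x^2, whose gradient is 2x.
params, velocity = [1.0], [0.0]
for _ in range(300):
    grads = [2.0 * p for p in params]
    params, velocity = sgd_momentum_step(params, grads, velocity)

print(f"x after 300 steps: {params[0]:.2e}")
```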

Usage

Local Inference

from model.pipeline import SageT2IPipeline

pipe = SageT2IPipeline(model_path="checkpoints/dit_best.pt")
image = pipe("a photorealistic cat", num_steps=50, output_size=1024)
image.save("output.png")

Gradio Web UI

python app.py

Local Training

python train_local.py

Deployment

Deploy to Hugging Face (Model Hub + Space)

The project includes an automated deployment script. It will:

  1. Verify the checkpoint is real (size + tensor count checks)
  2. Create a Model Hub repository with weights, config, and pipeline code
  3. Create a Gradio Space with the interactive web demo

# Set your token (get one at https://hf.co/settings/tokens)
# Windows cmd syntax shown; on Linux/macOS use: export HF_TOKEN=hf_your_token_here
set HF_TOKEN=hf_your_token_here

# Deploy both model hub and space
python deploy_to_hf.py

# Deploy just the model hub
python deploy_to_hf.py --model-only

# Deploy just the space
python deploy_to_hf.py --space-only

The script will prompt for your token if HF_TOKEN is not set.
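
Step 1 of the script (size + tensor count checks) is not spelled out in this README. The following is a sketch of what such a check could look like. The tolerance, the function names, and the assumption that the checkpoint is a PyTorch file holding a flat `state_dict` of float32 weights are all illustrative and are not taken from `deploy_to_hf.py`.

```python
import os

PARAM_COUNT = 43_400_000  # 43.4M parameters, from the architecture table
BYTES_PER_PARAM = 4       # ASSUMPTION: weights stored as float32


def expected_min_bytes(param_count=PARAM_COUNT, tolerance=0.5):
    """Smallest file size accepted as plausible (hypothetical threshold)."""
    return int(param_count * BYTES_PER_PARAM * tolerance)


def size_looks_real(path, min_bytes=None):
    """True if path exists and is at least min_bytes large."""
    if min_bytes is None:
        min_bytes = expected_min_bytes()
    return os.path.isfile(path) and os.path.getsize(path) >= min_bytes


def count_tensors(path):
    """Count tensors in a checkpoint, ASSUMING it is a flat state_dict."""
    # Imported lazily so the size check works without PyTorch installed.
    import torch

    state = torch.load(path, map_location="cpu")
    return sum(1 for value in state.values() if torch.is_tensor(value))


print(size_looks_real("checkpoints/dit_best.pt"))
```

A float32 checkpoint with 43.4M parameters would weigh roughly 173.6 MB, so a file of only a few kilobytes would fail the size check long before any tensors are counted.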

Manual Deployment

Model Hub

git lfs install
git clone https://huggingface.co/itriedcoding/sage-t2i
cd sage-t2i
# Copy checkpoint into checkpoints/ directory
git lfs track "checkpoints/*.pt"
git add .
git commit -m "Add model checkpoint"
git push

Space (Gradio Web UI)

  1. Go to https://huggingface.co/new-space
  2. Set Space name: sage-t2i
  3. Select SDK: Gradio
  4. Select hardware: CPU upgrade (recommended)
  5. Upload the Space files (app.py, .space, requirements.txt, model package)
  6. For the model checkpoint, either:
    • Upload via git LFS to the Space repo, or
    • Set MODEL_PATH Space secret to point to the model hub
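
How `app.py` reads `MODEL_PATH` is not shown here. One plausible pattern, sketched below, relies on the fact that Space secrets are exposed to the app as environment variables, and falls back to the local checkpoint path from the Local Inference example. `resolve_model_path` and `fetch_from_hub` are hypothetical helpers, and the filename inside the hub repository is an assumption.

```python
import os

DEFAULT_CHECKPOINT = "checkpoints/dit_best.pt"  # path used in Local Inference


def resolve_model_path(env=None):
    """Return MODEL_PATH if set and non-empty, else the default checkpoint."""
    env = os.environ if env is None else env
    value = env.get("MODEL_PATH", "").strip()
    return value or DEFAULT_CHECKPOINT


def fetch_from_hub(repo_id="itriedcoding/sage-t2i", filename=DEFAULT_CHECKPOINT):
    """Download the checkpoint from the Model Hub and return its local path.

    ASSUMPTION: the checkpoint sits at `filename` inside the hub repository.
    """
    # Imported lazily so resolve_model_path() works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


print(resolve_model_path())
```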

Self-Hosted

git clone https://huggingface.co/itriedcoding/sage-t2i
cd sage-t2i
pip install -r requirements.txt
python app.py

HuggingFace Resources
