
Open3DForge - Full Build Plan

Project: Personal image-to-game-ready 3D asset pipeline
Owner: Basel · Solo dev for "What Remains" (UE5.7)
Hosting: Single HF Pro Space, ZeroGPU H200, 25 min/day quota
SDK: Gradio 6.x (currently 6.14.0)
Repo: Reverb/open3dforge


Architectural Decisions (Locked In)

These were debated and decided during early development. Don't relitigate them mid-build.

  1. One Space, not multiple. Option B: vendor TRELLIS.2 + Hunyuan3D-2 + UniRig into this Space rather than orchestrating across multiple Spaces via gradio_client. Larger repo, but single deployment, no inter-Space latency, no auth juggling.

  2. HF Space only, no local fallback. Don't build a path for running on the RTX 4070. Quota is enough for personal use.

  3. Gradio 6.x. Match the sdk_version in README.md. No upper bound in requirements.txt.

  4. UE5-first export defaults. DirectX normals, ORM packing, cm units, Z-up, SM_/SK_/T_ naming.

  5. Drop the custom website. Standard Gradio tabs. gradio.Server is not worth the work for one user.

  6. Drop CHORD. Research-only license. Use TRELLIS.2's own metallic/roughness volume attributes instead, which are already correct and license-compatible.

  7. nvdiffrast for all baking. Not Blender headless. Fast (~2-5s per bake), GPU-based, fits inside @spaces.GPU, no apt install needed beyond what TRELLIS already requires.

  8. Solo-user workspace pattern. One workspace/current/ folder. No session IDs, no multi-tenancy.


Reference Code: Pipeline Stage Order

INPUT: 1-4 images
   │
   ├── rembg (background removal, CPU)
   │
   ▼
[Stage 1] GENERATION                          [GPU]
   ├── TRELLIS.2 (hard surface) or Hunyuan3D-2 (organic)
   ├── SAVE high_poly.glb (kept for normal baking)
   └── Extract: albedo + metallic + roughness volume attrs
   │
   ▼
[Stage 2] POST-PROCESSING                     [CPU + GPU baking]
   2A  Mesh repair             pymeshfix          CPU
   2B  Geometry cleanup        PyMeshLab          CPU
   2C  Decimation                                 CPU
        ├── Preview: fast-simplification
        └── Final:   PyMeshLab quality pass
   2D  Symmetry (characters)   PyMeshLab          CPU
   2E  UV unwrap               xatlas             CPU
        (texels_per_unit packing)
   2F  Normal bake             nvdiffrast         GPU
        DX + GL outputs
   2G  Albedo bake             nvdiffrast         GPU
        vertex color → UV atlas
   2H  Material bake           nvdiffrast         GPU
        TRELLIS volume attrs → UV
   2I  AO bake                 nvdiffrast         GPU
        ray-occlusion → UV
   2J  Texture inpaint (opt)   SDXL inpaint       GPU
        hidden UV regions
   2K  Channel pack            numpy              CPU
        Unreal ORM / Unity MetSmooth
   2L  LOD generation          PyMeshLab          CPU
        LOD0/LOD1/LOD2 only (UE5 HLOD handles billboards)
   2M  Collision mesh          CoACD              CPU
   2N  Pivot correction        trimesh            CPU
   2O  Scale validation        trimesh            CPU
   │
   ▼
[Stage 3] AUTO-RIGGING (Optional)             [GPU]
   UniRig → rigged.glb / rigged.fbx
   │
   ▼
[Stage 4] EXPORT
   UE5 preset (default) → DX normals + ORM
   Naming: SM_/SK_/T_ convention
   → export_AssetName_UE5.zip

Milestone Plan

Each milestone is sized to be one focused work session. Push at the end of each, verify, then move on.


✅ Milestone 1 - Foundation (COMPLETE)

Status: Deployed and verified. ZeroGPU smoke test passes.

What got built:

  • HF Space scaffolded with Gradio 6 + ZeroGPU
  • 5 tabs: Generate / Post-Process / Auto-Rig / Export / Presets + Diagnostics
  • workspace/ folder pattern (current/exports/presets/history)
  • src/workspace.py - AssetState, preset save/load
  • src/quota.py - daily quota tracking
  • src/ui_helpers.py - status bar, asset summary, viewer model picker
  • gr.Model3D viewer wired up
  • Pipeline stubs returning placeholder messages
  • Diagnostics tab with @spaces.GPU smoke test

Files in repo:

open3dforge/
├── README.md             ← sdk_version: 6.14.0
├── requirements.txt      ← gradio>=5.0, spaces, numpy, pillow
├── .gitignore
├── app.py                ← main entry, all UI wiring
├── src/
│   ├── __init__.py
│   ├── workspace.py
│   ├── quota.py
│   └── ui_helpers.py
└── workspace/
    ├── current/.gitkeep
    ├── exports/.gitkeep
    ├── presets/.gitkeep
    └── history/.gitkeep

🟡 Milestone 2 - Stage 1: TRELLIS.2 Generation (NEXT)

Goal: Real image-to-3D generation working end-to-end. Upload image → get a GLB in the viewer.

Approach: Option B. Duplicate the microsoft/TRELLIS.2 Space, merge its contents into our repo, then refactor app.py to integrate with our tab structure.

Step-by-step:

  1. Duplicate microsoft/TRELLIS.2 Space to get a known-good baseline:

    • On HF: huggingface.co/spaces/microsoft/TRELLIS.2 → Duplicate this Space → name it open3dforge-trellis-staging
    • This is a staging copy: we don't deploy it, we just clone it locally for the merge
    • Confirm it builds and runs in your duplicate before touching anything
  2. Clone both repos locally:

    git clone https://huggingface.co/spaces/Reverb/open3dforge
    git clone https://huggingface.co/spaces/baselanaya/open3dforge-trellis-staging
    
  3. Copy TRELLIS.2 assets into open3dforge:

    cp -r open3dforge-trellis-staging/trellis2/ open3dforge/
    cp -r open3dforge-trellis-staging/assets/ open3dforge/
    cp open3dforge-trellis-staging/autotune_cache.json open3dforge/
    cp open3dforge-trellis-staging/packages.txt open3dforge/
    

    This gives us the vendored trellis2/ Python package, HDRI envmaps, FlexGemm cache, and apt deps.

  4. Merge requirements.txt: Combine the TRELLIS.2 requirements with our existing ones. Add to requirements.txt:

    # TRELLIS.2 deps (from microsoft/TRELLIS.2 Space)
    torch
    torchvision
    opencv-python-headless  # provides the cv2 module
    imageio
    imageio-ffmpeg
    rembg
    # plus the custom wheels they install at build time
    

    Copy theirs verbatim and add to ours. Inspect the resolved requirements.txt in the staging duplicate first.

  5. Refactor app.py to integrate the TRELLIS handlers:

    • Move TRELLIS pipeline init to module level (per ZeroGPU rules, the model must be moved to CUDA at module level, outside any @spaces.GPU function)
    • Wrap their image_to_3d + extract_glb functions as the implementation of our existing stub_generate handler
    • Update the Generate tab to match TRELLIS parameter names (resolution, ss_sampling_steps, etc.)
    • Hide most TRELLIS knobs behind the "Advanced" accordion; expose only Quality preset + Seed at top level
    • Keep our quality presets (Fast/Balanced/Hero) mapping to their parameter sets
    • Hook the output GLB into workspace.get_state().raw_gen_glb and save the high-poly separately
  6. Critical: save high_poly.glb before decimation. TRELLIS's extract_glb calls o_voxel.postprocess.to_glb(decimation_target=...). We need to call it once with no decimation (or a very high target like 16M faces, the nvdiffrast limit they use) to get the high-poly we'll bake from in Stage 2, then call it again with the user's chosen decimation_target for the working low-poly.

  7. Update workspace state on success:

    state = workspace.get_state()
    state.high_poly_glb = Path("workspace/current/high_poly.glb")
    state.raw_gen_glb = Path("workspace/current/raw_gen.glb")
    state.face_count = len(mesh.faces)
    state.vertex_count = len(mesh.vertices)
    state.model_used = "TRELLIS.2"
    
  8. Test:

    • Push to Space
    • Wait for build (~10-15 min due to CUDA wheels compiling)
    • Upload a test image
    • Confirm the GLB appears in the viewer
    • Confirm Diagnostics quota tracker shows time consumed
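
The state update in step 7 assumes AssetState exposes those five fields. A minimal sketch of that shape is below. The real class lives in src/workspace.py and may differ, and record_generation is a hypothetical helper name used only for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class AssetState:
    """Sketch only: mirrors the fields assigned in step 7."""
    high_poly_glb: Optional[Path] = None
    raw_gen_glb: Optional[Path] = None
    face_count: int = 0
    vertex_count: int = 0
    model_used: str = ""


# Solo-user pattern: one module-level state object, no session IDs.
_STATE = AssetState()


def get_state() -> AssetState:
    """Return the single shared state object."""
    return _STATE


def record_generation(workspace_dir: Path, face_count: int,
                      vertex_count: int, model_used: str) -> AssetState:
    """Hypothetical helper: write Stage 1 results into the shared state."""
    state = get_state()
    state.high_poly_glb = workspace_dir / "high_poly.glb"
    state.raw_gen_glb = workspace_dir / "raw_gen.glb"
    state.face_count = face_count
    state.vertex_count = vertex_count
    state.model_used = model_used
    return state
```

Keeping the update in one function means the Generate handler and any later re-generation path write the same fields in the same way.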

Quality presets to wire up (map to TRELLIS params):

| Preset | resolution | ss_steps | shape_steps | tex_steps | Expected time |
|---|---|---|---|---|---|
| Fast | 512 | 8 | 8 | 8 | ~30s |
| Balanced | 1024 | 12 | 12 | 12 | ~60s |
| Hero | 1536 | 16 | 16 | 16 | ~90s |
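
The preset table can live in code as a small lookup. This is a sketch, not the final wiring: resolution and ss_sampling_steps are the names mentioned in step 5, while shape_sampling_steps and tex_sampling_steps are assumed names for the shape_steps and tex_steps columns and need checking against the vendored TRELLIS.2 code.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class QualityPreset:
    resolution: int
    ss_sampling_steps: int
    shape_sampling_steps: int  # assumed name for the "shape_steps" column
    tex_sampling_steps: int    # assumed name for the "tex_steps" column
    expected_seconds: int      # rough estimate from the table, for quota warnings


QUALITY_PRESETS = {
    "Fast": QualityPreset(512, 8, 8, 8, 30),
    "Balanced": QualityPreset(1024, 12, 12, 12, 60),
    "Hero": QualityPreset(1536, 16, 16, 16, 90),
}


def preset_kwargs(name: str) -> dict:
    """Return generation kwargs for a preset, without the time estimate."""
    params = asdict(QUALITY_PRESETS[name])
    params.pop("expected_seconds")
    return params
```

The expected_seconds field stays separate from the generation kwargs so the same table can feed the quota warning planned for Milestone 12.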

Risk mitigation:

  • TRELLIS.2 build can fail in many ways (CUDA wheel compilation, flash-attn install). If a build fails, check the build logs for which wheel failed. The staging duplicate is the reference: if it built there, the issue is in your merge.
  • Don't move anything into @spaces.GPU functions that should be at module level. Pipeline init goes at module level.

Milestone 2b - Hunyuan3D-2 Alternative Generator

Goal: Second generator option for organic shapes (characters, creatures).

Approach: Same duplicate-and-vendor pattern as Milestone 2.

  1. Duplicate tencent/Hunyuan3D-2 to staging Space
  2. Clone, copy the hy3dgen/ package into open3dforge
  3. Merge requirements (most overlap with TRELLIS.2: torch, diffusers)
  4. Add Hunyuan pipeline init at module level
  5. The model dropdown in the Generate tab routes between image_to_3d_trellis() and image_to_3d_hunyuan()
  6. Hunyuan needs 16GB VRAM. That fits alongside TRELLIS in the H200's 70GB, but load only one at a time via lazy module-level guards

Decision deferred to this milestone: Whether to keep both models in VRAM at module load (faster, more memory) or lazy-load per call (slower first call, less memory). Test both.
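
The lazy-load option could look roughly like the guard below. It is a sketch under assumptions: register_loader and get_pipeline are hypothetical names, and the loaders are plain callables standing in for the real TRELLIS.2 and Hunyuan3D-2 init code.

```python
from typing import Any, Callable, Dict, Optional

# Registry of loader callables, keyed by model name (hypothetical structure).
_LOADERS: Dict[str, Callable[[], Any]] = {}
_active_name: Optional[str] = None
_active_pipeline: Any = None


def register_loader(name: str, loader: Callable[[], Any]) -> None:
    """Register a zero-argument function that builds a pipeline."""
    _LOADERS[name] = loader


def get_pipeline(name: str) -> Any:
    """Return the requested pipeline, keeping at most one loaded at a time."""
    global _active_name, _active_pipeline
    if name not in _LOADERS:
        raise KeyError(f"No loader registered for {name!r}")
    if _active_name != name:
        # Drop the previous pipeline first so its memory can be reclaimed.
        # Real code would likely also release cached GPU memory here.
        _active_pipeline = None
        _active_pipeline = _LOADERS[name]()
        _active_name = name
    return _active_pipeline
```

Switching models pays the load cost again, which is the trade-off this milestone is meant to measure against keeping both resident.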


Milestone 3 - Stage 2A-2C: Mesh Cleanup

Goal: Working CPU-side mesh repair, cleanup, and decimation with live preview.

Dependencies to add:

trimesh[easy]
pymeshfix
pymeshlab
fast-simplification

Files to create:

  • src/stages/__init__.py
  • src/stages/stage2_repair.py - pymeshfix wrapper
  • src/stages/stage2_cleanup.py - PyMeshLab filter chain
  • src/stages/stage2_decimate.py - both fast-simplification (preview) and PyMeshLab (final)

UI work in app.py:

  • Wire the existing checkboxes/sliders in Tab 2 to call the real implementations
  • Live preview: slider .change() event fires fast-simplification, updates face count display
  • Run button: actually runs full pipeline on the current GLB

Workspace state updates:

  • state.repaired_glb, state.cleaned_glb, state.low_poly_glb all get populated as steps complete

Test criteria:

  • Generate a TRELLIS asset (50k faces)
  • Run repair → no errors
  • Run cleanup → no errors
  • Set decimation slider to 10k → live preview updates face count
  • Click "Run final" → produces low_poly.glb at 10k faces
  • Viewer auto-refreshes to show the cleaned mesh
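
The slider works in absolute face counts, while a preview simplifier may expect a reduction fraction instead (my understanding is that fast-simplification's target_reduction argument works this way, but verify against the installed version). A small conversion helper, with a hypothetical name, might look like this:

```python
def target_reduction(current_faces: int, target_faces: int) -> float:
    """Convert an absolute face target into a fraction of faces to remove.

    0.0 means "keep everything"; 0.8 means "remove 80% of the faces".
    """
    if current_faces <= 0:
        raise ValueError("current_faces must be positive")
    if target_faces >= current_faces:
        # The mesh is already at or below the target: nothing to remove.
        return 0.0
    return 1.0 - max(target_faces, 0) / current_faces
```

For the test case above, 50k faces down to 10k is a reduction of 0.8.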

Milestone 4 - Stage 2D-2E: Symmetry + UV Unwrap

Goal: Symmetry enforcement + xatlas UV unwrapping with consistent texel density.

Dependencies to add:

xatlas

Files to create:

  • src/stages/stage2_symmetry.py - PyMeshLab apply_filter_mesh_symmetrize
  • src/stages/stage2_uv.py - xatlas with texels_per_unit packing

UI work:

  • Symmetry: off / bilateral-X / bilateral-Y / radial dropdown
  • UV: atlas resolution, texels_per_unit, padding

Test criteria:

  • Run on a human-character GLB → symmetry produces clean mirror
  • UV unwrap produces unwrapped.glb with valid UV0 coords visible if you inspect via trimesh
  • No overlapping UV islands (check with PyMeshLab's quality measure)
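
To sanity-check the atlas resolution and texels_per_unit controls against each other, a rough estimate of the atlas size a given density implies can help. The sketch below is back-of-envelope arithmetic, not xatlas's own packing logic. The function name and the 0.7 packing efficiency default are assumptions.

```python
import math


def estimate_atlas_size(surface_area: float, texels_per_unit: float,
                        packing_efficiency: float = 0.7) -> int:
    """Estimate a square power-of-two atlas side length in pixels.

    surface_area is in squared scene units, so the texel budget is
    area * density^2, inflated by the fraction of the atlas that packing
    is assumed to actually fill.
    """
    if surface_area <= 0 or texels_per_unit <= 0:
        raise ValueError("surface_area and texels_per_unit must be positive")
    if not 0 < packing_efficiency <= 1:
        raise ValueError("packing_efficiency must be in (0, 1]")
    texels_needed = surface_area * texels_per_unit ** 2
    side = math.ceil(math.sqrt(texels_needed / packing_efficiency))
    # Round up to the next power of two.
    return 1 << (side - 1).bit_length()
```

If the estimate is far above the chosen atlas resolution, either the density or the resolution setting needs to change before baking.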

Milestone 5 - Stage 2F: Normal Baking with nvdiffrast

Goal: High-poly → low-poly normal map baking, GPU-accelerated, 2-5 second bakes.

Dependencies to add:

  • nvdiffrast (already installed via TRELLIS.2 wheels; verify in the staging duplicate)

Files to create:

  • src/stages/stage2_bake_normal.py - full nvdiffrast pipeline

Algorithm (from the plan doc):

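import nvdiffrast.torch as dr
import spaces


@spaces.GPU(duration=60)
def bake_normal_map(high_poly_path, low_poly_path, uv_coords, map_size=2048):
    """Outline of the bake: steps 1-7 are still to be implemented."""
    ctx = dr.RasterizeCudaContext()  # CUDA rasterizer context for the UV-space pass
    # 1. UV → clip space
    # 2. Rasterize low-poly UVs → per-pixel world position + tri ID
    # 3. For each pixel: nearest-on-surface from high-poly
    # 4. Sample high-poly normal at that point
    # 5. Transform to tangent space (low-poly tangent frame)
    # 6. Pack RGB [0,1], save PNG
    # 7. Dilate edges past UV island boundaries
    raise NotImplementedError("Milestone 5: bake steps 1-7 not implemented yet")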

Output: Two PNGs, normal_gl.png and normal_dx.png (the DX version has the green (Y) channel inverted).
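
Step 6 and the DX/GL split reduce to two small array operations. The sketch below assumes the bake yields an (H, W, 3) float array of tangent-space normals in the OpenGL convention. pack_normals and gl_to_dx are hypothetical helper names.

```python
import numpy as np


def pack_normals(normals: np.ndarray) -> np.ndarray:
    """Map unit normals from [-1, 1] to 8-bit RGB in [0, 255]."""
    length = np.linalg.norm(normals, axis=-1, keepdims=True)
    unit = normals / np.maximum(length, 1e-8)  # guard against zero-length vectors
    return np.round((unit * 0.5 + 0.5) * 255.0).astype(np.uint8)


def gl_to_dx(normal_gl: np.ndarray) -> np.ndarray:
    """Convert an OpenGL-style normal map to DirectX by inverting green."""
    normal_dx = normal_gl.copy()
    normal_dx[..., 1] = 255 - normal_dx[..., 1]
    return normal_dx
```

A flat surface (normal pointing straight out along +Z) packs to the familiar bluish-purple pixel, and applying gl_to_dx twice returns the original map.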

Test criteria:

  • Run on TRELLIS character output (50k high-poly → 10k low-poly)
  • Bake completes in <10 seconds
  • Open the normal map in any image viewer: it should be bluish/purple with surface detail visible
  • Both DX and GL versions are produced
  • Quota shows 5-10 seconds consumed

Milestone 6 - Stage 2G-2I: Albedo, Material, AO Baking

Goal: Three more nvdiffrast bakes producing the full PBR texture set.

Files to create:

  • src/stages/stage2_bake_albedo.py
  • src/stages/stage2_bake_material.py - uses TRELLIS.2's stored metallic+roughness attrs
  • src/stages/stage2_bake_ao.py - ray-occlusion in hemisphere

Key reuse: Same nvdiffrast rasterization pattern as Milestone 5. Refactor that code into a shared helper _rasterize_uv_atlas() in src/stages/_baking_helpers.py.

Workspace state: All texture paths populated on the AssetState.

Test criteria:

  • All five maps (normal, albedo, metallic, roughness, AO) viewable as PNG thumbnails in Tab 2
  • Total Stage 2 baking time < 30 seconds for a Balanced-quality asset

Milestone 7 - Stage 2J: SDXL Inpainting for Hidden UVs

Goal: Detect stretched/synthetic UV regions and inpaint them with SDXL.

Dependencies to add:

diffusers
accelerate
safetensors

Files to create:

  • src/stages/stage2_inpaint.py
    • detect_hidden_regions(albedo, uvs, faces) - variance analysis
    • inpaint_hidden_uvs(...) - SDXL inpainting pipeline

UI: Toggle off by default (costs ~30s quota). Prompt input. Strength slider.

Test criteria:

  • Generate an asset with a clear "back side" (e.g., a humanoid character)
  • Without inpainting: back of character has visible texture stretching
  • With inpainting: back is plausibly filled in
  • Quota cost: ~30s per inpaint

Milestone 8 - Stage 2K-2O: Finalization Steps

Goal: Channel packing, LODs, collision, pivot, scale. All CPU-side, fast.

Dependencies to add:

coacd==1.0.4

Files to create:

  • src/stages/stage2_channel_pack.py - numpy ORM / MetallicSmoothness packing
  • src/stages/stage2_lods.py - PyMeshLab quality-aware LOD0/1/2
  • src/stages/stage2_collision.py - CoACD with trimesh.convex_hull fallback
  • src/stages/stage2_pivot.py - bottom_center / geometric_center / custom
  • src/stages/stage2_scale.py - height presets, UE5 cm units

UI: All controls already scaffolded in Milestone 1's Post-Process tab. Just wire to real implementations.

Test criteria:

  • ORM packed as RGB with AO/Roughness/Metallic in correct channels
  • LOD0/LOD1/LOD2 all generated, all share same UV layout
  • Collision mesh has <1% the triangle count of LOD0
  • Pivot at bottom_center for a generated human character results in feet at the world origin (height 0 on the up axis: Y in the glTF working files, Z after the Z-up UE5 export)
  • Scale: human asset is 1.8m tall = 180cm in UE5 export
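
Channel packing itself is a few lines of numpy. The sketch below assumes each input is a single-channel uint8 map of the same size. The ORM order (AO in R, roughness in G, metallic in B) follows the first test criterion above, while the MetallicSmoothness layout (metallic in RGB, smoothness in alpha) is an assumption to confirm against the Unity target. Function names are hypothetical.

```python
import numpy as np


def _check_same_shape(*maps: np.ndarray) -> None:
    """Raise if the single-channel input maps disagree on size."""
    shapes = {m.shape for m in maps}
    if len(shapes) != 1:
        raise ValueError(f"Texture maps differ in shape: {sorted(shapes)}")


def pack_orm(ao: np.ndarray, roughness: np.ndarray,
             metallic: np.ndarray) -> np.ndarray:
    """Pack AO, roughness, metallic into the R, G, B channels."""
    _check_same_shape(ao, roughness, metallic)
    return np.stack([ao, roughness, metallic], axis=-1).astype(np.uint8)


def pack_metallic_smoothness(metallic: np.ndarray,
                             roughness: np.ndarray) -> np.ndarray:
    """Pack metallic into RGB and smoothness (inverted roughness) into alpha."""
    _check_same_shape(metallic, roughness)
    metallic8 = metallic.astype(np.uint8)
    smoothness = 255 - roughness.astype(np.uint8)
    return np.stack([metallic8, metallic8, metallic8, smoothness], axis=-1)
```

The shape check matters because the bakes can be run at different map sizes, and a silent broadcast would produce a corrupt texture.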

Milestone 9 - Stage 3: UniRig Auto-Rigging

Goal: Generate a skeleton + skinning weights for character meshes.

Approach: Same vendor-the-Space pattern as Milestone 2.

  1. Duplicate MohamedRashad/UniRig Space → staging
  2. Verify it builds in the staging duplicate
  3. Copy UniRig/ package into our repo
  4. Merge requirements
  5. Wire to the Auto-Rig tab handler
  6. Output: rigged FBX (UE5 default) or GLB

Test criteria:

  • Run on a humanoid character (after full Stage 2 processing)
  • Output FBX imports into UE5 as a Skeletal Mesh
  • Drag into Mixamo → animations auto-attach correctly

Milestone 10 - Stage 4: UE5 Export

Goal: Bundle everything into a UE5-ready zip with proper naming and packing.

Dependencies to add:

pygltflib

Files to create:

  • src/stages/stage4_export.py
    • export_ue5(asset_state, asset_name, asset_type) → zip_path
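    • Handles FBX conversion (planned via trimesh; verify first that trimesh can write FBX, since FBX is not among its standard export formats)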
    • Applies naming convention (SM_, SK_, T_)
    • Writes ORM-packed textures to correct paths
    • Zip + drop in workspace/exports/

Engine presets (only UE5 fully implemented):

  • UE5: FBX, DX normals, ORM, Z-up, cm (the default)
  • Unity HDRP: FBX, GL normals, MetallicSmoothness, Y-up, m (stub for later)
  • Godot/Blender/Web: stubs

Test criteria:

  • Export a character → unzip → 6-7 files following naming convention
  • Import to UE5: drag-drop the zip's contents → no warnings, materials auto-create from textures
  • Both Static Mesh and Skeletal Mesh paths work
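
The naming and bundling part of export_ue5 needs only the standard library. The sketch below covers just that part. ue5_name and bundle_export are hypothetical helpers, and texture suffixes such as _ORM are illustrative rather than decided.

```python
import zipfile
from pathlib import Path
from typing import Dict

# Prefixes from the UE5 naming convention in the architectural decisions.
PREFIXES = {"static_mesh": "SM_", "skeletal_mesh": "SK_", "texture": "T_"}


def ue5_name(kind: str, asset_name: str, suffix: str = "") -> str:
    """Build a UE5-style asset name, e.g. SM_Crate or T_Crate_ORM."""
    if kind not in PREFIXES:
        raise ValueError(f"Unknown asset kind: {kind!r}")
    return f"{PREFIXES[kind]}{asset_name}{suffix}"


def bundle_export(files: Dict[str, Path], asset_name: str,
                  out_dir: Path) -> Path:
    """Zip the given files as export_<AssetName>_UE5.zip inside out_dir.

    files maps the name each file should have inside the zip to its
    location on disk.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    zip_path = out_dir / f"export_{asset_name}_UE5.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for archive_name, source in files.items():
            archive.write(source, archive_name)
    return zip_path
```

Renaming happens at zip time, so the working files in workspace/current/ can keep their stage-oriented names.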

Milestone 11 - Presets System

Goal: Save and load named parameter configurations across tabs.

Files to update:

  • src/workspace.py - already has save_preset/load_preset/delete_preset, just needs the JSON schema fleshed out
  • app.py - wire the Presets tab's Save button to actually read all current tab values

Schema:

{
  "name": "character_UE5_hero",
  "stage1": { ... },
  "stage2": { ... },
  "stage3": { ... },
  "stage4": { ... }
}

Ship five default presets:

  • character_UE5_hero.json
  • character_UE5_npc.json
  • prop_UE5_hero.json
  • prop_UE5_standard.json
  • environment_UE5_background.json
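
A round trip for the schema above could be as small as the sketch below. The signatures are assumptions and may not match the existing save_preset/load_preset in src/workspace.py. The only rule enforced here is that all four stage keys are present.

```python
import json
from pathlib import Path

STAGE_KEYS = ("stage1", "stage2", "stage3", "stage4")


def save_preset(preset_dir: Path, name: str, stages: dict) -> Path:
    """Write a preset as <name>.json; every stage key must be present."""
    missing = [key for key in STAGE_KEYS if key not in stages]
    if missing:
        raise ValueError(f"Preset is missing stages: {missing}")
    preset_dir.mkdir(parents=True, exist_ok=True)
    payload = {"name": name}
    payload.update({key: stages[key] for key in STAGE_KEYS})
    path = preset_dir / f"{name}.json"
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def load_preset(preset_dir: Path, name: str) -> dict:
    """Read a preset back as a plain dict."""
    path = preset_dir / f"{name}.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

Requiring every stage key, even when empty, keeps the Presets tab's load path free of per-stage existence checks.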

Milestone 12 - Polish & Production Hardening

  • Error handling on every stage (don't crash the app, show clear error in UI)
  • Progress bars during long ops (gr.Progress(track_tqdm=True))
  • Quota cost shown before each GPU operation (warning if it would exceed remaining)
  • Game-ready checklist passes shown before allowing Export
  • Asset history sidebar (last 5 generated assets with thumbnails)
  • Session cleanup of workspace/current/ on new generation
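
The pre-flight quota warning is simple arithmetic against the 1500s daily budget. A sketch with hypothetical names follows. The existing src/quota.py may already track used seconds differently.

```python
from typing import Optional

DAILY_QUOTA_SECONDS = 1500  # 25 min/day of ZeroGPU time


def remaining_seconds(used_seconds: float) -> float:
    """Seconds of GPU quota left today, never negative."""
    return max(0.0, DAILY_QUOTA_SECONDS - used_seconds)


def quota_warning(used_seconds: float,
                  estimated_cost: float) -> Optional[str]:
    """Return a warning string if the operation would exceed the quota."""
    left = remaining_seconds(used_seconds)
    if estimated_cost > left:
        return (f"This step needs about {estimated_cost:.0f}s of GPU time "
                f"but only {left:.0f}s remain today.")
    return None
```

The estimate can come from the preset's expected time for Stage 1 and from fixed per-stage guesses (for example ~30s for inpainting) elsewhere.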

Working with Claude Code

When you continue in Claude Code, you'll have the full repo locally. Key things to remember:

Project conventions

  1. Each stage = its own module in src/stages/. Don't dump pipeline logic into app.py.
  2. Workspace state is the single source of truth. Every stage reads from and writes to workspace.get_state().
  3. GPU functions live where they're needed, not all in app.py. The @spaces.GPU decorator works in any file as long as spaces is imported.
  4. No if __name__ == "__main__": on demo.launch(). HF Spaces imports app.py at module level.
  5. Gradio 6 specifics:
    • theme and css go in launch(), not Blocks()
    • show_api is gone; use footer_links=["gradio", "settings"]
    • api_visibility replaces api_name=False on events
  6. The 3 global components (viewer, summary, status_bar) get refreshed via _global_refresh() chained off every pipeline action button. Don't forget to add new buttons to that list.

Useful commands

# Pull the latest Space state
cd open3dforge
git pull

# Make changes, syntax-check before push
python -c "import ast; ast.parse(open('app.py').read())"

# Push to deploy
git add -A
git commit -m "Milestone N: <stage>"
git push

# Watch build/runtime logs at:
# https://huggingface.co/spaces/Reverb/open3dforge?logs=container

Common HF Space build failures (we've hit these)

| Symptom | Cause | Fix |
|---|---|---|
| Cannot install gradio<X and gradio==Y | sdk_version in README conflicts with requirements.txt pin | Remove version pin in requirements.txt or update README's sdk_version |
| Blocks.launch() got an unexpected keyword argument 'X' | Gradio 6 removed parameter | Check Gradio 6 migration guide for replacement |
| When localhost is not accessible | demo.launch() wrapped in if __name__ == "__main__" | Move to module level |
| CUDA wheel compile failures | Mismatched torch/CUDA versions | Match TRELLIS.2 staging duplicate's exact pins |
| OOM during model load | Multiple large models loaded at module level | Lazy-load with module-level guards inside @spaces.GPU |

Useful resources


Constraints to Remember

  • Daily quota: 1500s (25 min) of H200 time per day. Plan asset iteration accordingly.
  • VRAM budget: ~70GB per workload. TRELLIS.2 alone is 24GB; UniRig is 8GB; SDXL inpaint is 8GB. Don't load all at once.
  • Function timeout: Default @spaces.GPU duration is 60s. Override with duration=N for longer ops (Stage 1 generation, AO bake high quality).
  • Build time: With TRELLIS.2 vendored + CUDA wheels, expect 10-15 min builds. Cache hits will be ~3 min.
  • Repo size: Will grow large with vendored models + HDRIs. Git LFS may be needed for the autotune_cache.json (1MB) and wheel files (100MB+). HF Spaces handles this via Xet storage automatically.

Plan version 3.0 - May 15, 2026
Last action completed: Milestone 1 deployed, ZeroGPU smoke test passing
Next action: Milestone 2 - duplicate microsoft/TRELLIS.2 staging Space, merge into open3dforge