MuseTalk Mirror (A.I.M.I)

Mirror of TMElyralab/MuseTalk V1.5 and its inference-time dependencies, re-hosted so the A.I.M.I desktop product can rely on stable URLs. All files are unmodified copies of the upstream originals.

MuseTalk re-syncs the lips of an existing video to match a new audio track: it edits only the mouth region and passes the rest of each frame through unchanged. It pairs with our TTS + Voice-Clone stack for full "text → lip-synced video" workflows.

Files

Folder / File	Upstream	Size	Purpose
`musetalkV15/unet.pth`	TMElyralab/MuseTalk	3.24 GB	MuseTalk V1.5 UNet weights
`musetalkV15/musetalk.json`	TMElyralab/MuseTalk	748 B	UNet config
`sd-vae-ft-mse/diffusion_pytorch_model.bin`	stabilityai/sd-vae-ft-mse	319 MB	VAE for face latents
`sd-vae-ft-mse/config.json`	stabilityai/sd-vae-ft-mse	547 B	VAE config
`whisper/pytorch_model.bin`	openai/whisper-tiny	144 MB	Audio feature extraction (tiny)
`dwpose/dw-ll_ucoco_384.pth`	yzd-v/DWPose	388 MB	Face bbox + pose detection
`face-parse-bisent/79999_iter.pth`	ManyOtherFunctions/face-parse-bisent	51 MB	BiSeNet face-region parser
`face-parse-bisent/resnet18-5c106cde.pth`	pytorch.org/models	45 MB	ResNet18 backbone for face-parser

Total: ~4.2 GB (sum of the sizes listed above).
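
After downloading, it can help to confirm that the local copy has the layout shown in the table above. The following is a minimal sketch, not part of MuseTalk or of A.I.M.I: the names `EXPECTED_FILES` and `missing_files` are hypothetical, and the relative paths are simply copied from the table. It only checks that each file exists; it does not verify sizes or checksums.

```python
from pathlib import Path

# Relative paths copied from the Files table above.
EXPECTED_FILES = [
    "musetalkV15/unet.pth",
    "musetalkV15/musetalk.json",
    "sd-vae-ft-mse/diffusion_pytorch_model.bin",
    "sd-vae-ft-mse/config.json",
    "whisper/pytorch_model.bin",
    "dwpose/dw-ll_ucoco_384.pth",
    "face-parse-bisent/79999_iter.pth",
    "face-parse-bisent/resnet18-5c106cde.pth",
]


def missing_files(root):
    """Return the expected relative paths that are not regular files under root.

    `root` is the directory the mirror was downloaded into (an assumption:
    the folder structure of the mirror is preserved on disk).
    """
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_files("path/to/download/dir")` returns an empty list when all eight files are present, and otherwise lists the paths that still need to be fetched.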

Licenses

Component	License
MuseTalk	MIT (Tencent Music Entertainment Lyra Lab)
SD-VAE-ft-MSE	CreativeML Open RAIL-M (Stability AI)
Whisper	MIT (OpenAI)
DWPose	Apache 2.0
face-parse-bisent	MIT
ResNet18 (pretrained)	BSD-3-Clause (PyTorch / Facebook)

All of the licenses listed above permit commercial use. Every file is redistributed unchanged; see the upstream repos for the full license texts.

Attribution

MuseTalk: Yue Zhang, Minhao Liu, Zhaokang Chen, Bin Wu, Yubin Zeng, Chao Zhan, Yingjie He, Junxin Huang, Wenjiang Zhou — MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting (2024).
Whisper: Alec Radford et al. — Robust Speech Recognition via Large-Scale Weak Supervision (OpenAI, 2022).
DWPose: Zhendong Yang, Ailing Zeng, Chun Yuan, Yu Li — Effective Whole-body Pose Estimation with Two-stages Distillation (ICCV 2023).
BiSeNet: Changqian Yu et al. — BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation (ECCV 2018).
