OneFormer
Segment images into objects, instances, or scenes
Reconstruct 3D scenes from images using MASt3R and 3DGS
Image-to-3D Generation
Image → 3D Human Mesh (GLB) - MCP Server
Transcribe audio files into text
Find similar images in a dataset with a single upload
Generate a 4K image from text in 5 seconds
Transcribe or translate audio from microphone, file, or YouTube
Convert audio to subtitles
Transcribe audio to text from microphone, file, or YouTube video
Transcribe and translate YouTube video audio
Video Dubbing with Open Source Projects
Transcribe or translate audio and YouTube videos to text
Generate depth maps and 3D views from photos
Transcribe audio files into text