Generate depth maps from your photos
Generate a depth video from an input video
Segment images with click points and download cutouts
Transcribe audio files to text instantly
Generate captions for your images instantly
Generate conversational speech
Convert typed text into spoken audio
Transcribe audio to text instantly using WebGPU