Wenxi Chen

worstchan

https://cwx-worst-one.github.io/

cwx-worst-one

AI & ML interests

understanding & generation in speech and audio

Recent Activity

updated a model about 10 hours ago

worstchan/WavTTS

Text-to-Speech • Updated about 10 hours ago • 18 • 4

New activity in worstchan/WavTTS 3 days ago

FP8 Quantization?

#1 opened 3 days ago by

authored a paper 3 days ago

WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling

Paper • 2606.03455 • Published 4 days ago

published a model 4 days ago

worstchan/WavTTS

Text-to-Speech • Updated about 10 hours ago • 18 • 4

upvoted a collection 3 months ago

SoulX-Duplug

3 items • Updated Mar 17 • 5

liked a model 5 months ago

worstchan/EAT-base_epoch30_pretrain

Feature Extraction • 90M • Updated May 6, 2025 • 3.5k • 6

liked a dataset 6 months ago

tutu0604/UltraVoice

Viewer • Updated Nov 13, 2025 • 101k • 703 • 15

upvoted a paper 7 months ago

SoulX-Podcast: Towards Realistic Long-form Podcasts with Dialectal and Paralinguistic Diversity

Paper • 2510.23541 • Published Oct 27, 2025 • 17

liked a model 7 months ago

Soul-AILab/SoulX-Podcast-1.7B

Text-to-Speech • 2B • Updated Dec 18, 2025 • 239 • 234

upvoted a collection 7 months ago

SoulX-Podcast

Models of SoulX-Podcast • 3 items • Updated Mar 2 • 47

updated 2 models 8 months ago

Soul-AILab/SAC-16k-62_5Hz

Audio-to-Audio • Updated Oct 24, 2025 • 25 • 2

Soul-AILab/SAC-16k-37_5Hz

Audio-to-Audio • Updated Oct 24, 2025 • 11

authored a paper 8 months ago

SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization

Paper • 2510.16841 • Published Oct 19, 2025

updated a collection 8 months ago

SAC

Models of the SAC speech codec • 3 items • Updated Dec 2, 2025 • 3

liked a model 8 months ago

worstchan/EAT-base_epoch30_finetune_AS2M

Feature Extraction • 90.4M • Updated May 6, 2025 • 16.7k • 3

upvoted a collection 8 months ago

SAC

Models of the SAC speech codec • 3 items • Updated Dec 2, 2025 • 3

upvoted an article 9 months ago

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Article • sirluk • Oct 7, 2024 • 71

liked a model 10 months ago

AndreasXi/MeanAudio

Updated Aug 30, 2025 • 103 • 8