WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling Paper • 2606.03455 • Published 4 days ago
SoulX-Podcast: Towards Realistic Long-form Podcasts with Dialectal and Paralinguistic Diversity Paper • 2510.23541 • Published Oct 27, 2025 • 17
SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization Paper • 2510.16841 • Published Oct 19, 2025
worstchan/EAT-base_epoch30_finetune_AS2M Feature Extraction • 90.4M • Updated May 6, 2025 • 16.7k • 3
Article Efficient LLM Pretraining: Packed Sequences and Masked Attention sirluk • Oct 7, 2024 • 71