This is a compact multilingual self-supervised speech encoder (HuBERT-style) trained for one iteration on more than 10,000 hours of African-language data aggregated from various sources. In the paper, this model is referred to as AfriHuBERT-s. Stronger variants that use mHuBERT-147 as the backbone are also available: AfriHuBERT-o and AfriHuBERT-n.

AfriHuBERT covers 1,230 languages in total, including 1,226 indigenous African languages.

@misc{alabi2024afrihubertselfsupervisedspeechrepresentation,
      title={AfriHuBERT: A self-supervised speech representation model for African languages},
      author={Jesujoba O. Alabi and Xuechen Liu and Dietrich Klakow and Junichi Yamagishi},
      year={2024},
      eprint={2409.20201},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2409.20201},
}