How to use jankoko/SpecAugment-Whisper-small with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="jankoko/SpecAugment-Whisper-small")

# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("jankoko/SpecAugment-Whisper-small")
model = AutoModelForSpeechSeq2Seq.from_pretrained("jankoko/SpecAugment-Whisper-small")
```
This model is a fine-tuned version of openai/whisper-small on the wTIMIT-US dataset using SpecAugment, a time- and frequency-masking data augmentation method.
The model was fine-tuned jointly on normal and whispered speech, using SpecAugment in its LibriSpeech Double (LD) configuration. It serves as a baseline for comparison against phone-aware masking methods such as F0-Mask, F1-Mask, and LF-Mask.
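The masking idea behind SpecAugment can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the training code used for this model: the function name `spec_augment` and its parameters are hypothetical, the default values (two frequency masks of up to 27 mel channels, two time masks of up to 100 frames) are assumed from the LD policy as described in the original SpecAugment paper, and time warping is omitted.

```python
import numpy as np


def spec_augment(spec, num_freq_masks=2, max_freq_width=27,
                 num_time_masks=2, max_time_width=100,
                 mask_value=0.0, rng=None):
    """Apply frequency and time masking to a (n_mels, n_frames) spectrogram.

    Defaults are an assumption based on the published LD policy; the
    settings actually used for this model may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(spec, dtype=float, copy=True)  # never modify the input
    n_mels, n_frames = out.shape

    # Frequency masking: overwrite a band of consecutive mel channels.
    for _ in range(num_freq_masks):
        width = int(rng.integers(0, min(max_freq_width, n_mels) + 1))
        start = int(rng.integers(0, n_mels - width + 1))
        out[start:start + width, :] = mask_value

    # Time masking: overwrite a span of consecutive frames.
    for _ in range(num_time_masks):
        width = int(rng.integers(0, min(max_time_width, n_frames) + 1))
        start = int(rng.integers(0, n_frames - width + 1))
        out[:, start:start + width] = mask_value

    return out


# Example: an 80-channel, 3000-frame array standing in for a log-mel input.
features = np.random.default_rng(0).normal(size=(80, 3000))
augmented = spec_augment(features, rng=np.random.default_rng(1))
```

Each call draws new mask positions and widths, so the same utterance is masked differently every time it is seen during training.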
Evaluation Results on wTIMIT-US (Test Set)
| Setup | Training Data | Augmentation | WER (Normal) | WER (Whispered) |
|---|---|---|---|---|
| Baseline | Both modes | None | 5.8 | 11.7 |
| SpecAugment | Both modes | SpecAugment (LD) | 5.2 | 12.3 |
SpecAugment reduced WER on normal speech from 5.8 to 5.2 relative to the baseline without augmentation, a statistically significant improvement (p=0.014). On whispered speech, WER rose from 11.7 to 12.3, but this difference was not statistically significant (p=0.147).
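For readers unfamiliar with the metric, WER is the number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below computes it with a standard word-level edit distance. It is a simplified illustration with a hypothetical `wer` helper; it does not reproduce the text normalization or alignment details of the toolkit used to produce the figures above.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
score = wer("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {100 * score:.1f}%")
```

Multiplying the result by 100 gives a percentage figure.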
Cite as
Kokowski, J. (2025). F0-Based Masking Policies for Self-Supervised Whispered Speech Recognition. Master's Thesis, University of Groningen, Campus Fryslân.
Available at: https://campus-fryslan.studenttheses.ub.rug.nl/id/eprint/674
If you use this model or build upon this work, please cite the thesis above.
Model: Whisper-small
Augmentation: SpecAugment (LD)
Evaluation toolkit: SCTK (sclite)
Notes: For statistical comparisons and MAPSSWE evaluation, see Section 5 of the thesis.
Related Models
- SpecAugment (current)
- F0-Mask Version
- F1-Mask Version
- LF-Mask Version