Efik ASR: XLS-R 300M Fine-tuned + KenLM

A fine-tune of facebook/wav2vec2-xls-r-300m on Efik speech data, paired with a 3-gram KenLM language model for beam-search decoding.

Files

| File | Purpose |
| --- | --- |
| pytorch_model.bin | Fine-tuned model weights |
| vocab.json | CTC vocabulary |
| efik.arpa | KenLM 3-gram language model |
| efik_corpus.txt | Text corpus used to build the KenLM model |

Results (Eval Set, 527 samples)

| Metric | Score |
| --- | --- |
| WER | 10.86% |
| CER | 3.16% |
| Epochs | 10 |
| Base | facebook/wav2vec2-xls-r-300m |

Evaluated on a held-out eval set (20% split). All 527 samples were decoded without runtime failures.
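
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute WER and CER: total edit distance divided by total reference length, over words and characters respectively. It is an illustration only. The function names `edit_distance` and `error_rate` are hypothetical, the example strings are English placeholders rather than Efik data, and the scores above may have been produced with a different tool or averaging scheme.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate: total edits / total reference units.

    unit="word" gives WER, unit="char" gives CER.
    """
    total_edits = 0
    total_units = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_edits += edit_distance(ref_units, hyp_units)
        total_units += len(ref_units)
    return total_edits / total_units


# Placeholder strings (not taken from the Efik eval set).
refs = ["the cat sat"]
hyps = ["the cat sit"]
wer = error_rate(refs, hyps, unit="word")  # 1 wrong word out of 3
cer = error_rate(refs, hyps, unit="char")  # 1 wrong character out of 11
print(f"WER {wer:.2%}  CER {cer:.2%}")
```

Because a single wrong character makes the whole word count as an error, CER is usually well below WER, which is consistent with the gap between the two scores reported above.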

Reuse for Inference

from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
from pyctcdecode import build_ctcdecoder
from huggingface_hub import hf_hub_download
import torch, librosa
import kenlm
import pyctcdecode.decoder
import pyctcdecode.language_model

REPO_ID = "your-username/efik-xlsr-300m"

# Load model + processor
processor = Wav2Vec2Processor.from_pretrained(REPO_ID)
model = Wav2Vec2ForCTC.from_pretrained(REPO_ID)
model.eval()

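# Download the ARPA file, then point pyctcdecode at the imported kenlm module
# (useful if pyctcdecode was imported before kenlm was available)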
arpa_path = hf_hub_download(REPO_ID, "efik.arpa")
pyctcdecode.decoder.kenlm = kenlm
pyctcdecode.language_model.kenlm = kenlm

vocab = processor.tokenizer.get_vocab()
labels = [k for k, v in sorted(vocab.items(), key=lambda x: x[1])]
decoder = build_ctcdecoder(labels=labels, kenlm_model_path=arpa_path, alpha=0.5, beta=1.0)

# Transcribe
def transcribe(audio_path):
    speech, sr = librosa.load(audio_path, sr=16000)
    inputs = processor(speech, sampling_rate=sr, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return decoder.decode(logits[0].numpy()).replace("|", " ")
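
The snippet above sorts the vocabulary by token id because the decoder expects `labels[i]` to be the token for column `i` of the logits. The toy sketch below illustrates that ordering, together with a plain greedy CTC decode (best token per frame, collapse repeats, drop blanks). This is not the KenLM beam search used above; it is a simpler baseline for comparison. The helper names, the toy vocabulary, and the assumption that the blank token is `[PAD]` are all hypothetical, so check `vocab.json` for the real tokens.

```python
def labels_from_vocab(vocab):
    """Order tokens by their integer id so labels[i] matches logit column i."""
    return [tok for tok, idx in sorted(vocab.items(), key=lambda kv: kv[1])]


def greedy_ctc_decode(frame_scores, labels, blank="[PAD]", word_sep="|"):
    """Greedy CTC decode: argmax per frame, collapse repeats, drop blanks."""
    best_ids = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    tokens = []
    prev = None
    for idx in best_ids:
        # Emit a token only when it differs from the previous frame and is not blank.
        if idx != prev and labels[idx] != blank:
            tokens.append(labels[idx])
        prev = idx
    return "".join(tokens).replace(word_sep, " ").strip()


def one_hot(index, size):
    """Toy per-frame scores with a single clear winner."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Toy vocabulary (assumed, deliberately listed out of id order).
toy_vocab = {"a": 2, "[PAD]": 0, "|": 1, "b": 3}
toy_labels = labels_from_vocab(toy_vocab)

# Frame-wise winners: a a [PAD] a | b b
toy_frames = [one_hot(i, len(toy_labels)) for i in (2, 2, 0, 2, 1, 3, 3)]
toy_text = greedy_ctc_decode(toy_frames, toy_labels)
print(toy_labels, repr(toy_text))
```

The blank frame between the two runs of `a` is what lets CTC output a doubled letter; without it the repeats would collapse into one.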

Reuse for Further Fine-tuning

from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

REPO_ID = "your-username/efik-xlsr-300m"

# Load and continue training from checkpoint
processor = Wav2Vec2Processor.from_pretrained(REPO_ID)
model = Wav2Vec2ForCTC.from_pretrained(REPO_ID)

# Freeze feature extractor (same as original training)
for param in model.wav2vec2.feature_extractor.parameters():
    param.requires_grad = False

# Then set up Trainer as before with new data...