Arabic End-of-Utterance (EOU) Detection Model

This model detects whether a speaker has finished their turn in Arabic conversations, with an emphasis on the Saudi dialect. It is designed for real-time voice agents, such as those built on LiveKit.

Model Description

Base Model: UBC-NLP/MARBERT
Language: Arabic (with focus on Saudi/Gulf dialect)
Task: Binary classification (Complete vs Incomplete utterance)
Use Case: Real-time turn detection for voice agents

Labels

Label	ID	Description
INCOMPLETE	0	Speaker has not finished their turn
COMPLETE	1	Speaker has finished their turn
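
The mapping in the table can be written down as a small lookup, which is handy when working with raw class IDs rather than string labels. This is a minimal sketch; the names `ID2LABEL`, `LABEL2ID`, and `id_to_label` are illustrative and not part of the model's API.

```python
# Label mapping copied from the Labels table.
ID2LABEL = {0: "INCOMPLETE", 1: "COMPLETE"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def id_to_label(class_id: int) -> str:
    """Translate a predicted class ID into its label name."""
    if class_id not in ID2LABEL:
        raise ValueError(f"unknown class id: {class_id}")
    return ID2LABEL[class_id]


print(id_to_label(1))  # COMPLETE
```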

Performance

Out-of-Distribution Test Results (200 samples)

Metric	Complete (1)	Incomplete (0)
Precision	100.00%	85.94%
Recall	83.64%	100.00%
F1-Score	91.09%	92.44%

Overall Weighted F1: 91.76%
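
The F1 scores follow directly from the reported precision and recall, and can be recomputed as a sanity check. The sketch below is illustrative: the per-class sample counts are not stated here, so it assumes both classes have equal support, in which case the weighted F1 is a plain mean. Under that assumption the result agrees with the reported 91.76%.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the results table.
f1_complete = f1_score(precision=1.0000, recall=0.8364)
f1_incomplete = f1_score(precision=0.8594, recall=1.0000)

# ASSUMPTION: equal number of Complete and Incomplete test samples, so the
# support-weighted average reduces to an unweighted mean.
weighted_f1 = (f1_complete + f1_incomplete) / 2

print(f"F1 (Complete):   {f1_complete:.2%}")
print(f"F1 (Incomplete): {f1_incomplete:.2%}")
print(f"Weighted F1:     {weighted_f1:.2%}")
```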

Key Characteristics

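✅ Zero false interruptions on the out-of-distribution test set - The model never predicted "Complete" for an incomplete utterance (100% precision on Complete)
✅ Conservative behavior - Its errors fall on the side of waiting (83.64% recall on Complete), which suits voice agents: better to wait than interrupt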
✅ Fast inference - Suitable for real-time applications
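
To check the "false interruption" property on your own data, count the incomplete utterances that the model labels as complete. The sketch below uses a tiny made-up label list purely for illustration; it is not the evaluation data, and the function names are hypothetical.

```python
INCOMPLETE, COMPLETE = 0, 1


def count_false_interruptions(y_true: list[int], y_pred: list[int]) -> int:
    """Incomplete utterances predicted as complete (agent would interrupt)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(
        1 for truth, pred in zip(y_true, y_pred)
        if truth == INCOMPLETE and pred == COMPLETE
    )


def count_missed_endings(y_true: list[int], y_pred: list[int]) -> int:
    """Complete utterances predicted as incomplete (agent would wait longer)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(
        1 for truth, pred in zip(y_true, y_pred)
        if truth == COMPLETE and pred == INCOMPLETE
    )


# Toy labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0]

print(count_false_interruptions(y_true, y_pred))  # 0
print(count_missed_endings(y_true, y_pred))       # 1
```

A conservative detector trades the second count for the first: it may wait a little too long, but it avoids cutting the speaker off.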

Usage

Quick Start

from transformers import pipeline

# Load the model
eou_detector = pipeline(
    "text-classification",
    model="Amr-h/arabic-eou-marbert",
    device=0  # Use GPU, or -1 for CPU
)

# Detect end of utterance
text = "هل بلغوك انهم بيحتاجون ساعات اضافيه؟"
result = eou_detector(text)[0]

print(f"Label: {result['label']}")  # COMPLETE or INCOMPLETE
print(f"Confidence: {result['score']:.2%}")
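
The pipeline returns a dict with `label` and `score` keys, where `score` is the probability of the predicted label. A voice agent will usually want to turn that into a yes/no decision. The helper below is a minimal sketch: `should_respond` and the 0.8 threshold are assumptions for illustration, not tuned values, and the example dicts are hand-written stand-ins for real pipeline output.

```python
def should_respond(result: dict, min_confidence: float = 0.8) -> bool:
    """Return True only for a sufficiently confident COMPLETE prediction."""
    return result["label"] == "COMPLETE" and result["score"] >= min_confidence


# Hand-written examples in the same shape as the pipeline output.
print(should_respond({"label": "COMPLETE", "score": 0.95}))    # True
print(should_respond({"label": "COMPLETE", "score": 0.60}))    # False
print(should_respond({"label": "INCOMPLETE", "score": 0.99}))  # False
```

Raising `min_confidence` makes the agent more patient; lowering it makes it respond sooner at the risk of interrupting.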

With Model and Tokenizer

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("Amr-h/arabic-eou-marbert")
tokenizer = AutoTokenizer.from_pretrained("Amr-h/arabic-eou-marbert")

# Inference
text = "انتظر خلني اشوف وين حطيت ال"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=64)

with torch.no_grad():
    outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=-1).item()

label = "COMPLETE" if prediction == 1 else "INCOMPLETE"
print(f"Prediction: {label}")

For LiveKit Integration

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

class ArabicEOUDetector:
    def __init__(self, model_name="Amr-h/arabic-eou-marbert"):
        self.model = AutoModelForSequenceClassification.from_pretrained(model_name)
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model.eval()
        
    def predict(self, text: str) -> tuple[bool, float]:
        """
        Returns (is_complete, confidence)
        """
        inputs = self.tokenizer(
            text, 
            return_tensors="pt", 
            truncation=True, 
            max_length=64
        )
        
        with torch.no_grad():
            outputs = self.model(**inputs)
            probs = torch.softmax(outputs.logits, dim=-1)
            prediction = torch.argmax(probs, dim=-1).item()
            confidence = probs[0][prediction].item()
        
        is_complete = prediction == 1
        return is_complete, confidence
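
One common way to use such a detector in a voice agent is to shorten the silence timeout when the utterance looks complete and lengthen it otherwise. The sketch below shows that idea only; it does not use any LiveKit API. The function `endpoint_delay`, its delay values, and the threshold are all assumptions, and `fake_predict` is a stand-in so the example runs without downloading the model.

```python
from typing import Callable

# Same shape as ArabicEOUDetector.predict: text -> (is_complete, confidence)
Predictor = Callable[[str], tuple[bool, float]]


def endpoint_delay(
    text: str,
    predict: Predictor,
    min_delay: float = 0.3,
    max_delay: float = 2.0,
    threshold: float = 0.8,
) -> float:
    """Seconds of silence to wait before the agent replies."""
    is_complete, confidence = predict(text)
    if is_complete and confidence >= threshold:
        return min_delay  # confident end of turn: reply quickly
    return max_delay      # otherwise give the speaker more time


def fake_predict(text: str) -> tuple[bool, float]:
    """Stand-in predictor: treats a trailing Arabic question mark as complete."""
    return (text.endswith("؟"), 0.9)


print(endpoint_delay("هل بلغوك انهم بيحتاجون ساعات اضافيه؟", fake_predict))  # 0.3
print(endpoint_delay("انتظر خلني اشوف وين حطيت ال", fake_predict))           # 2.0
```

With the real model, pass `ArabicEOUDetector().predict` in place of `fake_predict`.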

Training Details

Training Data: ~12,000 Saudi dialect Arabic utterances
Validation Data: ~1,500 samples
Test Data: ~1,500 samples
Epochs: 3
Learning Rate: 2e-5
Batch Size: 32
Max Length: 64 tokens
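
For a rough sense of the training budget, the number of optimizer steps follows from the figures above. This is back-of-the-envelope arithmetic, assuming exactly 12,000 training examples (the card says approximately), a single device, and no gradient accumulation.

```python
import math

# Hyperparameters from the Training Details list; training-set size is approximate.
train_examples = 12_000
batch_size = 32
epochs = 3

# ASSUMPTION: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

print(f"steps per epoch: {steps_per_epoch}")  # 375
print(f"total steps:     {total_steps}")      # 1125
```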

Intended Use

Real-time voice agents and conversational AI
Turn-taking detection in Arabic dialogue systems
LiveKit agent integration
Customer service voice bots

Limitations

Optimized for Saudi/Gulf Arabic dialect
May require fine-tuning for other Arabic dialects
Designed for spoken/conversational text, not formal written Arabic

Citation

If you use this model, please cite:

@misc{arabic-eou-marbert,
  author = {YOUR_NAME},
  title = {Arabic End-of-Utterance Detection Model},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Amr-h/arabic-eou-marbert}
}

License

Apache 2.0

Model Files

Format: Safetensors
Model size: 0.2B params
Tensor type: F32

Evaluation Results (self-reported)

OOD Accuracy: 0.918
Weighted F1: 0.918