Mongolian News Headline Classifier

This model is a fine-tuned version of bert-base-multilingual-cased that classifies Mongolian news headlines into nine categories.

Categories

  1. Байгал орчин (Environment)
  2. Боловсрол (Education)
  3. Спорт (Sports)
  4. Технологи (Technology)
  5. Улс төр (Politics)
  6. Урлаг соёл (Arts & Culture)
  7. Хууль (Law)
  8. Эдийн засаг (Economy)
  9. Эрүүл мэнд (Health)
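
For downstream code that needs English category names, the list above can be kept as a small lookup table. This is a minimal sketch: the helper name to_english is hypothetical, and the case-insensitive match assumes the model emits lowercase labels such as 'спорт', as the usage example output suggests.

```python
# Mongolian category names from the list above, mapped to their English glosses.
CATEGORY_TRANSLATIONS = {
    "Байгал орчин": "Environment",
    "Боловсрол": "Education",
    "Спорт": "Sports",
    "Технологи": "Technology",
    "Улс төр": "Politics",
    "Урлаг соёл": "Arts & Culture",
    "Хууль": "Law",
    "Эдийн засаг": "Economy",
    "Эрүүл мэнд": "Health",
}

# Case-folded keys so that "спорт" and "Спорт" resolve to the same entry.
_LOOKUP = {name.casefold(): gloss for name, gloss in CATEGORY_TRANSLATIONS.items()}


def to_english(label: str) -> str:
    """Return the English gloss for a Mongolian category label (hypothetical helper)."""
    key = label.strip().casefold()
    if key not in _LOOKUP:
        raise KeyError(f"Unknown category label: {label!r}")
    return _LOOKUP[key]
```

If the model's actual label strings differ from the names listed here, the dictionary keys would need to be adjusted to match.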

Training Metrics & Results

The model was trained for 3 epochs on a custom Mongolian news dataset. According to the training logs (trainer_state.json), the metrics at the end of training, which are also the best among the logged steps, are:

  • Validation Accuracy: 87.11%
  • Validation Loss: 0.4071
  • Final Training Loss: 0.3286
  • Global Steps: 1596

Training Logs Summary

Step | Training Loss | Validation Loss | Validation Accuracy
-----|---------------|-----------------|--------------------
500  | 0.5081        | 0.4627          | 85.52%
1000 | 0.4102        | 0.4344          | 86.17%
1500 | 0.3299        | 0.4112          | 87.08%
1596 | -             | 0.4071          | 87.11%
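
A table like the one above can be rebuilt from the log_history list that the Hugging Face Trainer stores in trainer_state.json. The sketch below is illustrative only: the function names are hypothetical, the key eval_accuracy assumes the training script reported a metric named "accuracy", and the sample entries simply restate the table values (with accuracy written as a fraction, which is also an assumption).

```python
def summarize_log_history(log_history):
    """Merge training and evaluation entries that share a step into one row per step."""
    rows = {}
    for entry in log_history:
        step = entry.get("step")
        if step is None:
            continue
        row = rows.setdefault(step, {"step": step})
        # Training entries carry "loss"; evaluation entries carry "eval_*" keys.
        for key in ("loss", "eval_loss", "eval_accuracy"):
            if key in entry:
                row[key] = entry[key]
    return [rows[step] for step in sorted(rows)]


def best_checkpoint(rows, metric="eval_accuracy"):
    """Return the row with the highest value of `metric`, skipping rows without it."""
    scored = [row for row in rows if metric in row]
    return max(scored, key=lambda row: row[metric])


# Sample entries restating the table above (not read from the real file).
SAMPLE_LOG_HISTORY = [
    {"step": 500, "loss": 0.5081},
    {"step": 500, "eval_loss": 0.4627, "eval_accuracy": 0.8552},
    {"step": 1000, "loss": 0.4102},
    {"step": 1000, "eval_loss": 0.4344, "eval_accuracy": 0.8617},
    {"step": 1500, "loss": 0.3299},
    {"step": 1500, "eval_loss": 0.4112, "eval_accuracy": 0.8708},
    {"step": 1596, "eval_loss": 0.4071, "eval_accuracy": 0.8711},
]

rows = summarize_log_history(SAMPLE_LOG_HISTORY)
best = best_checkpoint(rows)
print(best["step"], best["eval_accuracy"])
```

To run this against the real logs, load the file with json.load and pass its "log_history" value to summarize_log_history.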

Usage

Load the model with the Hugging Face Transformers pipeline API:

from transformers import pipeline

# Load the classifier
classifier = pipeline("text-classification", model="Batuka0901/mongolian_news_classifier")

# Predict
headline = "Өнөөдөр хөлбөмбөгийн тэмцээнд Монголын шигшээ баг хожлоо."
result = classifier(headline)

print(result)
# Output: [{'label': 'спорт', 'score': 0.98...}]
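
With a validation accuracy of about 87%, some headlines will be misclassified, so it may be useful to flag low-confidence predictions instead of trusting every label. The following is a minimal sketch, not part of the model itself: the helper classify_with_threshold, the 0.6 cutoff, and the "uncertain" fallback label are all hypothetical choices. It assumes the classifier returns a one-element list of {'label', 'score'} dictionaries per headline, as in the output shown above.

```python
def classify_with_threshold(classifier, headlines, min_score=0.6, fallback_label="uncertain"):
    """Classify each headline and replace low-confidence labels with a fallback.

    `classifier` is any callable that maps a string to a list whose first item
    is a dict with "label" and "score" keys, such as a text-classification pipeline.
    """
    results = []
    for headline in headlines:
        top = classifier(headline)[0]  # highest-scoring prediction
        label = top["label"] if top["score"] >= min_score else fallback_label
        results.append({"headline": headline, "label": label, "score": top["score"]})
    return results


# Usage with the real model (downloads the weights on first call):
# from transformers import pipeline
# classifier = pipeline("text-classification", model="Batuka0901/mongolian_news_classifier")
# print(classify_with_threshold(classifier, [headline]))
```

The cutoff should be tuned on held-out data; 0.6 is only a placeholder.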