Tibetan page orientation classifier (DINOv3 ViT-S)

Predicts whether a Tibetan manuscript page image is upright (non_flipped) or upside-down (flipped, rotated 180°).

Fine-tuned DINOv3 ViT-S.

Dataset (HF): BDRC/tibetan-page-orientation-classifier-dataset
Experiment: dinov3_binary_flip_center_cropped
Checkpoint: final_model.pt (best validation macro-F1 across stages A + B + C)

Preprocessing (inference)

| Split | Mode |
| --- | --- |
| train | center_crop |
| val | center_crop |
| test | center_crop |

At inference, apply the same mode that was used in training (center_crop) before passing the image to the DINOv3 image processor (size 448).
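The following is a minimal sketch of how a center-crop box could be computed, assuming that center_crop means keeping the largest centered square of the page. That definition is an assumption, `center_crop_box` is a hypothetical helper name, and the bundled script may crop differently.

```python
def center_crop_box(width: int, height: int) -> tuple:
    """Return (left, top, right, bottom) of the largest centered square.

    Assumption: "center_crop" keeps the central square of the page image.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


# A wide, pecha-style page of 2000 x 500 pixels (illustrative numbers only).
box = center_crop_box(2000, 500)
print(box)  # (750, 0, 1250, 500)
```

With Pillow, a box in this form can be passed to `Image.crop(box)` before the image processor resizes the result to 448.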

Test metrics (n=854)

| Metric | Value |
| --- | --- |
| Accuracy | 100.0% |
| Macro F1 | 1.000 |
| AUC-ROC | 1.000 |
| Loss | 0.1237 |
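A loss of 0.1237 alongside 100% accuracy can look surprising. One plausible reading, assuming the test loss is computed with the same label smoothing as training (0.05, listed under Training config), is that cross-entropy against smoothed targets cannot reach zero. The sketch below computes that floor for two classes, using the smoothing convention of PyTorch's `CrossEntropyLoss`. Whether the reported loss was actually computed this way is not stated in the card.

```python
import math


def smoothed_ce_floor(epsilon: float, num_classes: int) -> float:
    """Minimum cross-entropy reachable against label-smoothed targets.

    Assumes the smoothed target puts 1 - epsilon + epsilon / K on the true
    class and epsilon / K on every other class. The floor equals the
    entropy of that target distribution.
    """
    on = 1.0 - epsilon + epsilon / num_classes
    off = epsilon / num_classes
    return -(on * math.log(on) + (num_classes - 1) * off * math.log(off))


floor = smoothed_ce_floor(epsilon=0.05, num_classes=2)
print(round(floor, 4))  # 0.1169
```

Under that assumption, the reported 0.1237 sits only slightly above the floor of about 0.1169.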

Training config

| Setting | Value |
| --- | --- |
| Stages | A + B + C |
| Epochs A / B | 5 / 10 |
| LR head A | 0.0005 |
| LR backbone B | 5e-06 |
| LR head B | 5e-05 |
| Unfreeze blocks B | 4 |
| Scheduler | cosine_warmup |
| Class weights | none |
| Label smoothing | 0.05 |
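For readers unfamiliar with the cosine_warmup scheduler name, here is a minimal sketch of a common shape: linear warmup followed by cosine decay to zero. The warmup length, the total step count, and the decay-to-zero endpoint are all assumptions made for illustration, since the card does not state them. `cosine_warmup_lr` is a hypothetical helper, not the training code.

```python
import math


def cosine_warmup_lr(step: int, total_steps: int, warmup_steps: int,
                     base_lr: float) -> float:
    """Linear warmup to base_lr, then cosine decay to 0 (assumed shape)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Stage A head learning rate from the table; step counts are illustrative.
schedule = [
    cosine_warmup_lr(s, total_steps=100, warmup_steps=10, base_lr=0.0005)
    for s in range(101)
]
print(schedule[0], schedule[10], schedule[100])
```

The learning rate peaks at the configured value right after warmup and then falls smoothly, which keeps late updates small.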

Training history

Training curves

Per-epoch metrics: training_history.json and the history entry of results.json.

Files

| File | Description |
| --- | --- |
| final_model.pt | Weights + idx_to_label + test metrics |
| results.json | Full training config, history, report |
| training_history.json | Stage A/B epoch logs |
| confusion_matrix.json | Machine-readable CM |
| confusion_matrix.png | Plot |
| training_history.png | Loss / F1 curves |
| model_card.json | Summary metadata |
| config.yaml | Training hyperparameters (copy) |
| inference_classifier.py | CLI inference on image paths |

Load weights

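import torch

# weights_only=False unpickles arbitrary Python objects,
# so only load checkpoints from a source you trust.
ckpt = torch.load("final_model.pt", map_location="cpu", weights_only=False)
print(ckpt["test_metrics"])   # metrics recorded on the test split
print(ckpt["idx_to_label"])   # class index -> label name mapping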

Inference

Use the same preprocessing as training (center_crop, size 448). The bundled script reads defaults from model_card.json when flags are omitted.

pip install -r requirements-inference.txt
python inference_classifier.py --checkpoint final_model.pt --image page.jpg

Explicit flags (recommended for reproducibility):

python inference_classifier.py \
  --checkpoint final_model.pt \
  --image page.jpg \
  --preprocess center_crop \
  --preprocess-size 448

Classes: non_flipped = upright page; flipped = upside-down (180° rotation).
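As an illustration of how the two classes could be used downstream, the sketch below turns a pair of raw logits into a label, a confidence, and the rotation needed to make the page upright. The index order in `idx_to_label` shown here is an assumption (read the real mapping from the checkpoint). The logits are made-up numbers, and `decode_prediction` is a hypothetical helper rather than part of the bundled script.

```python
import math


def decode_prediction(logits, idx_to_label):
    """Map raw logits to (label, confidence, corrective rotation in degrees)."""
    # Numerically stable softmax over the class logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    # Accept either int or str keys in the label map.
    labels = {int(k): v for k, v in idx_to_label.items()}
    label = labels[best]
    # A flipped page needs a 180 degree rotation to become upright.
    rotation = 180 if label == "flipped" else 0
    return label, probs[best], rotation


# Assumed index order; use ckpt["idx_to_label"] in practice.
idx_to_label = {0: "non_flipped", 1: "flipped"}
label, confidence, rotation = decode_prediction([-1.2, 2.3], idx_to_label)
print(label, round(confidence, 3), rotation)
```

Returning the rotation alongside the label keeps the correction step explicit, so a caller can skip pages predicted as non_flipped.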

Citation

@misc{bdrcpageorientationmodel,
  title  = {Tibetan Page Orientation Classifier (DINOv3)},
  author = {Buddhist Digital Resource Center and OpenPecha},
  year   = {2026},
  url    = {https://huggingface.co/BDRC/tibetan-page-orientation-classifier},
  dataset = {https://huggingface.co/BDRC/tibetan-page-orientation-classifier-dataset},
  experiment = {center-cropped},
  note   = {Trained on BDRC manuscript images}
}

License

Model weights and inference code: Apache License 2.0.

Acknowledgements

All images are provided by the Buddhist Digital Resource Center (BDRC). This dataset was developed by Dharmaduta from specifications provided by BDRC for the project "The BDRC Etext Corpus", with funding from the Khyentse Foundation. Developed by Dharmaduta / OpenPecha.
