Lite-BERT-SL: Sequence Labeling for HiFi-KPI Lite

Lite-BERT-SL is a BERT-based sequence labeling model fine-tuned on the HiFi-KPI Lite dataset. This model was introduced in the paper HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings.

Model Description

The model is designed for the hierarchical extraction of Key Performance Indicators (KPIs) from financial earnings filings (SEC 10-K and 10-Q reports). While the full HiFi-KPI dataset covers a large taxonomy of iXBRL tags, Lite-BERT-SL is fine-tuned on a manually curated subset focusing on four expert-mapped KPI clusters:

  • Revenues
  • Earnings
  • EPS (Earnings Per Share)
  • EBIT (Earnings Before Interest and Taxes)

Model Details

  • Developed by: Rasmus Aavang, Giovanni Rizzi, Rasmus Bøggild, Alexandre Iolov, Mike Zhang, Johannes Bjerva
  • Model type: Token Classification (Sequence Labeling)
  • Base Model: bert-base-uncased
  • Language: English
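A minimal usage sketch with the Hugging Face transformers token-classification pipeline follows. This is an assumption of standard transformers usage, not an official snippet from the authors; the actual label set is defined in the model's config.

```python
# Hedged sketch (not an official usage example): loading Lite-BERT-SL with the
# Hugging Face transformers token-classification pipeline. Downloading the
# checkpoint requires network access the first time.
from transformers import pipeline

def build_kpi_tagger(model_id: str = "AAU-NLP/Lite-BERT-SL"):
    """Return a token-classification pipeline that merges subword predictions."""
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # group subword tokens into word-level spans
    )

# Usage (commented out so the module imports without downloading the model):
# tagger = build_kpi_tagger()
# print(tagger("Net revenues for the quarter were $4.2 billion."))
```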

Use Cases

  • Identifying and extracting generalized financial KPIs from earnings filings.
  • Automating the parsing of SEC 10-K and 10-Q reports for structured data extraction.
  • Assisting in the alignment of financial text with iXBRL taxonomies.
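As a sequence labeler, the model emits one label per token, which downstream code must group into KPI spans. The sketch below shows how such per-token labels could be decoded, assuming a BIO tagging scheme and illustrative label names like B-Revenues; neither is confirmed by the source, so check the model's config for the real label set.

```python
# Hypothetical sketch: decoding per-token BIO labels into (cluster, text) spans.
# The labels used here (B-Revenues, I-Revenues, O) are illustrative assumptions.

def decode_bio(tokens, labels):
    """Group (token, BIO-label) pairs into (cluster, span_text) tuples."""
    spans, current = [], None
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):            # a new entity span begins
            if current:
                spans.append(current)
            current = (label[2:], [token])
        elif label.startswith("I-") and current and label[2:] == current[0]:
            current[1].append(token)          # continue the open span
        else:                                 # "O" or an inconsistent I- tag
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(cluster, " ".join(toks)) for cluster, toks in spans]

tokens = ["Total", "revenues", "were", "$4.2", "billion"]
labels = ["B-Revenues", "I-Revenues", "O", "B-Revenues", "I-Revenues"]
print(decode_bio(tokens, labels))
# → [('Revenues', 'Total revenues'), ('Revenues', '$4.2 billion')]
```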

Performance

According to the paper, encoder-based models achieve a macro-F1 above 0.906 on the HiFi-KPI Lite classification task. For detailed performance metrics, please refer to the paper and the HiFi-KPI Lite dataset page.
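Macro-F1 averages per-class F1 scores with equal weight, so rare KPI clusters count as much as frequent ones. A minimal sketch of the computation follows; the per-cluster counts are made-up illustrative values, not results from the paper.

```python
# Sketch of macro-F1: the unweighted mean of per-class F1 scores.
# The confusion counts below are illustrative only, not the paper's results.

def f1(tp, fp, fn):
    """F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

def macro_f1(class_counts):
    """class_counts: {class_name: (tp, fp, fn)}; every class weighted equally."""
    scores = [f1(*counts) for counts in class_counts.values()]
    return sum(scores) / len(scores)

counts = {  # hypothetical per-cluster counts
    "Revenues": (90, 10, 5),
    "Earnings": (80, 5, 15),
    "EPS": (70, 10, 10),
    "EBIT": (60, 20, 10),
}
print(round(macro_f1(counts), 3))
# → 0.872
```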

Citation

If you use this model or the dataset in your research, please cite:

@article{aavang2025hifikpi,
  title={HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings},
  author={Aavang, Rasmus and Rizzi, Giovanni and B{\o}ggild, Rasmus and Iolov, Alexandre and Zhang, Mike and Bjerva, Johannes},
  journal={arXiv preprint arXiv:2502.15411},
  year={2025}
}