Lite-BERT-SL: Sequence Labeling for HiFi-KPI Lite
Lite-BERT-SL is a BERT-based sequence labeling model fine-tuned on the HiFi-KPI Lite dataset. This model was introduced in the paper HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings.
Model Description
The model is designed for the hierarchical extraction of Key Performance Indicators (KPIs) from financial earnings filings (SEC 10-K and 10-Q reports). While the full HiFi-KPI dataset contains a massive taxonomy of iXBRL tags, Lite-BERT-SL is fine-tuned on a manually curated subset focusing on four expert-mapped KPI clusters:
Revenues
Earnings
EPS (Earnings Per Share)
EBIT (Earnings Before Interest and Taxes)
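Labeling these four clusters as token-level spans is commonly done with a BIO tagging scheme. The sketch below builds such a label set for illustration; the actual tag names and `id2label` mapping of Lite-BERT-SL are an assumption here, not taken from the released model config.

```python
# Hypothetical BIO label scheme for the four expert-mapped KPI clusters.
# The real model's label names may differ; check its config before relying on these.
CLUSTERS = ["Revenues", "Earnings", "EPS", "EBIT"]

# "O" for tokens outside any KPI, plus B-/I- tags per cluster.
LABELS = ["O"] + [f"{prefix}-{cluster}" for cluster in CLUSTERS for prefix in ("B", "I")]

id2label = dict(enumerate(LABELS))
label2id = {label: i for i, label in id2label.items()}

print(LABELS)  # 9 labels in total: "O" plus B-/I- for each of the 4 clusters
```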
Developed by: Rasmus Aavang, Giovanni Rizzi, Rasmus Bøggild, Alexandre Iolov, Mike Zhang, Johannes Bjerva
Model type: Token Classification (Sequence Labeling)
Base model: bert-base-uncased
Language: English
Use Cases
- Identifying and extracting generalized financial KPIs from earnings filings.
- Automating the parsing of SEC 10-K and 10-Q reports for structured data extraction.
- Assisting in the alignment of financial text with iXBRL taxonomies.
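For downstream use, per-token BIO predictions (for example, from a standard token-classification pipeline) typically need to be merged into contiguous KPI spans. A minimal post-processing sketch, assuming BIO-style tags named after the KPI clusters (an assumption, not the model's documented label set):

```python
def bio_to_spans(tokens, tags):
    """Merge token-level BIO tags into (kpi_type, span_text) pairs.

    Assumes tags look like "B-Revenues" / "I-Revenues" / "O"; the exact
    tag names are hypothetical and may differ from the model's config.
    """
    spans, cur_type, cur_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if cur_type:  # close any open span before starting a new one
                spans.append((cur_type, " ".join(cur_tokens)))
            cur_type, cur_tokens = tag[2:], [token]
        elif tag.startswith("I-") and cur_type == tag[2:]:
            cur_tokens.append(token)  # continue the current span
        else:
            if cur_type:  # "O" tag or inconsistent "I-" ends the span
                spans.append((cur_type, " ".join(cur_tokens)))
            cur_type, cur_tokens = None, []
    if cur_type:
        spans.append((cur_type, " ".join(cur_tokens)))
    return spans

tokens = ["Total", "revenues", "were", "$", "4.2", "billion"]
tags = ["B-Revenues", "I-Revenues", "O", "O", "O", "O"]
print(bio_to_spans(tokens, tags))  # [("Revenues", "Total revenues")]
```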
Performance
According to the paper, encoder-based models achieve a macro-F1 above 0.906 on the HiFi-KPI Lite classification task. For detailed performance metrics, please refer to the paper and the HiFi-KPI Lite dataset page.
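Macro-F1 averages the per-class F1 scores with equal weight, so rare KPI classes count as much as frequent ones. A self-contained sketch of the metric (toy inputs for illustration, not results from the paper):

```python
def macro_f1(y_true, y_pred, labels):
    """Macro-averaged F1: unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)

# Toy example with two classes; each class's F1 is 2/3, so macro-F1 is 2/3.
y_true = ["Revenues", "EPS", "Revenues"]
y_pred = ["Revenues", "EPS", "EPS"]
print(macro_f1(y_true, y_pred, ["Revenues", "EPS"]))
```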
Dataset & Code
- Paper: HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings
- Dataset: HiFi-KPI Lite on Hugging Face
- Code: Official HiFi-KPI GitHub Repository
Citation
If you use this model or the dataset in your research, please cite:
@article{aavang2025hifikpi,
title={HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings},
author={Aavang, Rasmus and Rizzi, Giovanni and B{\o}ggild, Rasmus and Iolov, Alexandre and Zhang, Mike and Bjerva, Johannes},
journal={arXiv preprint arXiv:2502.15411},
year={2025}
}