HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings
Rasmus Aavang, Giovanni Rizzi, Rasmus Bøggild, Alexandre Iolov, Mike Zhang, Johannes Bjerva

TL;DR
This paper introduces HiFi-KPI, a large-scale dataset of hierarchically organized KPI labels drawn from earnings filings and linked to iXBRL taxonomies, with baseline results for KPI classification and extraction and open-source code.
Contribution
We present HiFi-KPI, a comprehensive hierarchical KPI dataset from earnings filings, supporting multiple extraction and classification tasks, and provide baseline results with open-source code.
Findings
Encoder-based models achieve over 0.906 macro-F1 on KPI classification on HiFi-KPI-Lite.
LLMs reach 0.440 F1 on structured KPI extraction.
Extraction errors mainly involve date-related information.
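For readers unfamiliar with the classification metric quoted above, here is a minimal sketch of macro-F1: the unweighted mean of per-label F1 scores. The function name `macro_f1` and the example labels are hypothetical, and the paper's own evaluation code may differ in details such as which labels are included.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical KPI labels, for illustration only (not taken from the dataset).
gold = ["Revenue", "Revenue", "NetIncome", "Assets"]
pred = ["Revenue", "NetIncome", "NetIncome", "Assets"]
score = macro_f1(gold, pred)
```

Because each label contributes equally regardless of frequency, macro-F1 penalizes poor performance on rare labels more than accuracy or micro-averaged F1 would.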
Abstract
Accurate tagging of earnings reports can yield significant short-term returns for stakeholders. The machine-readable inline eXtensible Business Reporting Language (iXBRL) is mandated for public financial filings. Yet, its complex, fine-grained taxonomy limits the cross-company transferability of tagged Key Performance Indicators (KPIs). To address this, we introduce the Hierarchical Financial Key Performance Indicator (HiFi-KPI) dataset, a large-scale corpus of 1.65M paragraphs and 198k unique, hierarchically organized labels linked to iXBRL taxonomies. HiFi-KPI supports multiple tasks and we evaluate three: KPI classification, KPI extraction, and structured KPI extraction. For rapid evaluation, we also release HiFi-KPI-Lite, a manually curated 8K paragraph subset. Baselines on HiFi-KPI-Lite show that encoder-based models achieve over 0.906 macro-F1 on classification, while Large…
Taxonomy
Topics: Stock Market Forecasting Methods
