MiNER: A Two-Stage Pipeline for Metadata Extraction from Municipal Meeting Minutes
Rodrigo Batista, Luís Filipe Cunha, Purificação Silvano, Nuno Guimarães, Alípio Jorge, Evelin Amorim, Ricardo Campos

TL;DR
This paper introduces a two-stage pipeline that uses question answering and transformer-based models to extract metadata from heterogeneous municipal meeting minutes, establishing a new benchmark for this task.
Contribution
It presents the first benchmark for metadata extraction from municipal minutes, combining QA and fine-grained entity recognition with evaluation of LLMs' performance, cost, and carbon footprint.
Findings
Strong in-domain performance surpassing larger general-purpose LLMs
Reduced cross-municipality generalization due to variability in records
Benchmarking of open-weight and closed-weight LLMs for this task
Abstract
Municipal meeting minutes are official documents of local governance, exhibiting heterogeneous formats and writing styles. Effective information retrieval (IR) requires identifying metadata such as meeting number, date, location, participants, and start/end times, elements that are rarely standardized or easy to extract automatically. Existing named entity recognition (NER) models are ill-suited to this task, as they are not adapted to such domain-specific categories. In this paper, we propose a two-stage pipeline for metadata extraction from municipal minutes. First, a question answering (QA) model identifies the opening and closing text segments containing metadata. Transformer-based models (BERTimbau and XLM-RoBERTa with and without a CRF layer) are then applied for fine-grained entity extraction and enhanced through delexicalization. To evaluate our proposed pipeline, we benchmark…
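The two-stage control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `locate_segments` is a hypothetical stand-in for the QA model (it simply takes the first and last characters of the document), and `extract_entities` is a hypothetical stand-in for the fine-tuned transformer taggers (it uses a few regular expressions). The labels, patterns, and function names are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # character offsets (start, end) into the document


@dataclass
class Entity:
    label: str
    text: str
    start: int
    end: int


def locate_segments(text: str, n_chars: int = 200) -> Dict[str, Span]:
    """Stage 1 stand-in (assumption): return the opening and closing spans.

    The paper uses a QA model here; this placeholder just takes the first
    and last n_chars characters of the document.
    """
    n = min(n_chars, len(text))
    return {"opening": (0, n), "closing": (len(text) - n, len(text))}


# Hypothetical patterns standing in for a learned token classifier.
PATTERNS = {
    "MEETING_NUMBER": re.compile(r"\bNo\. ?\d+"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "TIME": re.compile(r"\b\d{1,2}:\d{2}\b"),
}


def extract_entities(segment: str, offset: int) -> List[Entity]:
    """Stage 2 stand-in (assumption): tag entities inside one segment.

    Offsets are shifted by `offset` so they refer to the full document.
    """
    found = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(segment):
            found.append(
                Entity(label, m.group(0), offset + m.start(), offset + m.end())
            )
    return found


def run_pipeline(text: str) -> List[Entity]:
    """Run stage 1, then stage 2 on each located segment only."""
    seen = set()
    entities = []
    for start, end in locate_segments(text).values():
        for ent in extract_entities(text[start:end], start):
            key = (ent.label, ent.start, ent.end)
            if key not in seen:  # segments overlap in short documents
                seen.add(key)
                entities.append(ent)
    return sorted(entities, key=lambda e: e.start)


if __name__ == "__main__":
    sample = (
        "Minutes No. 12 of the meeting held on 03/05/2024, opened at 09:30. "
        + "Discussion of budget items. " * 20
        + "The meeting was closed at 12:15."
    )
    for ent in run_pipeline(sample):
        print(ent.label, ent.text)
```

The point of the split is that the second stage only sees the short opening and closing segments, so a real token classifier would not have to process the full body of the minutes. Swapping either stand-in for a trained model would leave `run_pipeline` unchanged.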
