PTPARL-D: Annotated Corpus of 44 years of Portuguese Parliament debates
Paulo Almeida, Manuel Marques-Pita, Joana Gonçalves-Sá

TL;DR
This paper introduces PTPARL-D, a comprehensive annotated corpus of 44 years of Portuguese Parliament debates, enhancing accessibility and analysis of parliamentary data for democratic transparency.
Contribution
The paper provides the first extensive annotated corpus of Portuguese parliamentary debates spanning over four decades, facilitating research and transparency.
Findings
Corpus covers 1976-2019 debates
Includes annotations for analysis
Enhances transparency and research accessibility
Abstract
In a representative democracy, some decide in the name of the rest, and these elected officials are commonly gathered in public assemblies, such as parliaments, where they discuss policies, legislate, and vote on fundamental initiatives. A core aspect of such democratic processes is the plenary debates, where important public discussions take place. Many parliaments around the world are increasingly keeping the transcripts of such debates, and other parliamentary data, in digital formats accessible to the public, increasing transparency and accountability. Furthermore, some parliaments are bringing old paper transcripts to semi-structured digital formats. However, these records are often provided only as raw text or even as images, with little to no annotation and in inconsistent formats, making them difficult to analyze and study and reducing both transparency and public reach. Here, we…
Taxonomy
Topics: Computational and Text Analysis Methods
