ParliaBench: An Evaluation and Benchmarking Framework for LLM-Generated Parliamentary Speech
Marios Koniaris, Argyro Tsipi, Panayiotis Tsanakas

TL;DR
ParliaBench is an evaluation framework and dataset for assessing the linguistic quality and political authenticity of parliamentary speeches generated by large language models, a narrow application that standard NLP metrics do not cover well.
Contribution
The paper presents ParliaBench, a new benchmark with novel political alignment metrics and a dataset of UK Parliament speeches for training and evaluating LLMs in parliamentary contexts.
Findings
Fine-tuning improves speech quality and political authenticity.
Novel metrics effectively measure ideological positioning.
Benchmark enables systematic evaluation of LLMs in political speech generation.
Abstract
Parliamentary speech generation presents specific challenges for large language models beyond standard text generation tasks. Unlike general text generation, parliamentary speeches require not only linguistic quality but also political authenticity and ideological consistency. Current language models lack specialized training for parliamentary contexts, and existing evaluation methods focus on standard NLP metrics rather than political authenticity. To address this, we present ParliaBench, a benchmark for parliamentary speech generation. We constructed a dataset of speeches from UK Parliament to enable systematic model training. We introduce an evaluation framework combining computational metrics with LLM-as-a-judge assessments for measuring generation quality across three dimensions: linguistic quality, semantic coherence, and political authenticity. We propose two novel…
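To make the three-dimension evaluation concrete, the following is a minimal sketch of how per-speech scores from computational metrics and from an LLM judge might be combined and averaged over a benchmark run. Everything here is an assumption for illustration: the names `SpeechScores`, `combine`, and `benchmark_summary`, the normalisation of scores to [0, 1], and the weighted-average rule are hypothetical and are not taken from the paper, which may aggregate its metrics differently.

```python
from dataclasses import dataclass
from statistics import mean

# The three dimensions named in the abstract.
DIMENSIONS = ("linguistic_quality", "semantic_coherence", "political_authenticity")


@dataclass
class SpeechScores:
    """Scores for one generated speech (hypothetical container).

    Both dicts map each dimension to a value assumed to be normalised to [0, 1].
    """
    computational: dict[str, float]  # from automatic metrics
    judge: dict[str, float]          # from an LLM-as-a-judge rating


def combine(scores: SpeechScores, judge_weight: float = 0.5) -> dict[str, float]:
    """Blend the two score sources per dimension (assumed weighted average)."""
    if not 0.0 <= judge_weight <= 1.0:
        raise ValueError("judge_weight must lie in [0, 1]")
    return {
        dim: (1.0 - judge_weight) * scores.computational[dim]
        + judge_weight * scores.judge[dim]
        for dim in DIMENSIONS
    }


def benchmark_summary(all_scores: list[SpeechScores], judge_weight: float = 0.5) -> dict[str, float]:
    """Average the blended per-dimension scores over all speeches."""
    per_speech = [combine(s, judge_weight) for s in all_scores]
    return {dim: mean(p[dim] for p in per_speech) for dim in DIMENSIONS}


# Placeholder values for illustration only; these are not results from the paper.
example = [
    SpeechScores(
        computational={d: 0.5 for d in DIMENSIONS},
        judge={d: 1.0 for d in DIMENSIONS},
    ),
    SpeechScores(
        computational={d: 0.25 for d in DIMENSIONS},
        judge={d: 0.75 for d in DIMENSIONS},
    ),
]
summary = benchmark_summary(example)
```

Keeping the two score sources separate until the final blend makes it easy to report them side by side, or to set `judge_weight` to 0 or 1 to inspect either source alone.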
Taxonomy
Topics: Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining · Topic Modeling
