Automatic Classification of News Subjects in Broadcast News: Application to a Gender Bias Representation Analysis

Valentin Pelloin; Lena Dodson; Émile Chapuis; Nicolas Hervé; David Doukhan

arXiv:2407.14180·cs.CL·July 22, 2024

TL;DR

This study develops a computational framework that uses an LLM and a smaller fine-tuned model to classify news topics and analyze gender bias in French TV and radio news, revealing that women are underrepresented in subjects such as sports, politics and conflicts.

Contribution

It uses few-shot LLM annotations to fine-tune a smaller, cheaper topic classifier for broadcast news transcriptions, applies the resulting topics to a gender bias analysis, and releases an annotated evaluation dataset of 804 dialogues.

Findings

01 Women are underrepresented in sports, politics and conflicts

02 Women have more speaking time than their overall average in weather, commercials and health

03 Representation varies between private and public channels

Abstract

This paper introduces a computational framework designed to delineate gender distribution biases in topics covered by French TV and radio news. We transcribe a dataset of 11.7k hours, broadcast in 2023 on 21 French channels. A Large Language Model (LLM) is used in few-shot conversation mode to obtain a topic classification on those transcriptions. Using the generated LLM annotations, we explore the fine-tuning of a specialized smaller classification model to reduce the computational cost. To evaluate the performance of these models, we construct and annotate a dataset of 804 dialogues. This dataset is made available free of charge for research purposes. We show that women are notably underrepresented in subjects such as sports, politics and conflicts. Conversely, on topics such as weather, commercials and health, women have more speaking time than their overall average across all…
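The comparison described in the abstract (women's speaking time per topic against their overall average) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the `(topic, gender, seconds)` segment layout, and the toy durations are all assumptions made for illustration, and the numbers are invented rather than taken from the paper.

```python
from collections import defaultdict


def women_share_by_topic(segments):
    """Return women's share of speaking time for each topic.

    `segments` is a list of (topic, gender, seconds) tuples, a hypothetical
    layout where gender is "F" or "M".
    """
    totals = defaultdict(lambda: {"F": 0.0, "M": 0.0})
    for topic, gender, seconds in segments:
        totals[topic][gender] += seconds
    return {
        topic: t["F"] / (t["F"] + t["M"])
        for topic, t in totals.items()
    }


def overall_women_share(segments):
    """Return women's share of speaking time across all topics."""
    women = sum(seconds for _, gender, seconds in segments if gender == "F")
    total = sum(seconds for _, _, seconds in segments)
    return women / total


# Invented toy durations (seconds), NOT results from the paper.
toy_segments = [
    ("sports", "M", 90.0),
    ("sports", "F", 10.0),
    ("weather", "F", 60.0),
    ("weather", "M", 40.0),
]

shares = women_share_by_topic(toy_segments)
overall = overall_women_share(toy_segments)

# A negative gap means women speak less on that topic than their
# overall average; a positive gap means they speak more.
for topic, share in sorted(shares.items()):
    print(topic, round(share, 2), round(share - overall, 2))
```

In this sketch a topic counts as one where women are underrepresented when its share falls below the overall share, which mirrors the comparison the abstract describes without reproducing the paper's actual measurement pipeline.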

Code & Models

Repositories

ina-foss/is24_news_topic
none · Official
