LLMs left, right, and center: Assessing GPT's capabilities to label political bias from web domains
Raphael Hernandes, Giulio Corsi

TL;DR
This study evaluates GPT-4's ability to classify the political bias of news sources from their URLs alone. It finds a high correlation with human ratings, but also a high abstention rate and a tendency toward more polarized outputs.
Contribution
It demonstrates GPT-4's potential as a scalable tool for political bias classification, highlighting its strengths and limitations compared to human assessments.
Findings
High correlation ($\rho = .89$) with human bias ratings
GPT-4 abstains from about two-thirds of classifications
GPT-4's ratings tend to be more polarized and slightly more left-leaning than the human ratings
Abstract
This research investigates whether OpenAI's GPT-4, a state-of-the-art large language model, can accurately classify the political bias of news sources based solely on their URLs. Given the subjective nature of political labels, third-party bias ratings like those from Ad Fontes Media, AllSides, and Media Bias/Fact Check (MBFC) are often used in research to analyze news source diversity. This study aims to determine if GPT-4 can replicate these human ratings on a seven-degree scale ("far-left" to "far-right"). The analysis compares GPT-4's classifications against MBFC's, and controls for website popularity using Open PageRank scores. Findings reveal a high correlation ($\rho = .89$) between GPT-4's and MBFC's ratings, indicating the model's potential reliability. However, GPT-4 abstained from classifying approximately two-thirds of the…
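The comparison described in the abstract can be illustrated with a minimal sketch. It makes several assumptions: that $\rho$ denotes Spearman's rank correlation, that abstentions are excluded before correlating, and that the seven labels map to integers from -3 to +3. Only the endpoint labels "far-left" and "far-right" come from the abstract. The five intermediate label names, the `compare` helper, and the toy data are hypothetical and do not reproduce the authors' pipeline or results.

```python
from statistics import mean

# Assumed label set: only the endpoints "far-left" and "far-right" appear in
# the abstract; the five intermediate names are illustrative.
SCALE = ["far-left", "left", "center-left", "center",
         "center-right", "right", "far-right"]
SCORE = {label: i - 3 for i, label in enumerate(SCALE)}  # -3 .. +3


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def compare(model_labels, human_labels):
    """Return (abstention_rate, rho); None in model_labels marks an abstention."""
    pairs = [(SCORE[m], SCORE[h])
             for m, h in zip(model_labels, human_labels) if m is not None]
    abstention_rate = 1 - len(pairs) / len(model_labels)
    rho = spearman_rho([p[0] for p in pairs], [p[1] for p in pairs])
    return abstention_rate, rho


if __name__ == "__main__":
    # Toy, made-up labels purely for illustration (not data from the study).
    model = ["left", None, "far-right", "center", None, "far-left"]
    human = ["center-left", "center", "right", "center", "left", "left"]
    rate, rho = compare(model, human)
    print(f"abstention rate = {rate:.2f}, rho = {rho:.2f}")
```

Reporting the abstention rate alongside the correlation matters here because the correlation is computed only over the domains the model agreed to label. A high $\rho$ on that subset says nothing about the domains that were skipped.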
Taxonomy
Topics: Artificial Intelligence in Law · Judicial and Constitutional Studies
Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention · Dense Connections
