Global-Liar: Factuality of LLMs over Time and Geographic Regions

Shujaat Mirza; Bruno Coelho; Yuyuan Cui; Christina Pöpper; Damon McCoy

arXiv:2401.17839·cs.CL·February 1, 2024·1 cites

Open Access

TL;DR

This study evaluates the factual accuracy, biases, and stability of GPT models across regions and time, introducing a balanced dataset and analyzing how configurations affect reliability, highlighting disparities and areas for improvement.

Contribution

The paper introduces 'Global-Liar,' a novel dataset for evaluating LLM biases across regions and time, and provides insights into how model updates and configurations impact factuality and fairness.

Findings

01

The March version of GPT-4 is more factually accurate than its subsequent June release.

02

The models are biased toward the Global North, disadvantaging regions such as Africa and the Middle East.

03

Binary decision constraints reduce factuality.

Abstract

The increasing reliance on AI-driven solutions, particularly Large Language Models (LLMs) like the GPT series, for information retrieval highlights the critical need for their factuality and fairness, especially amidst the rampant spread of misinformation and disinformation online. Our study evaluates the factual accuracy, stability, and biases in widely adopted GPT models, including GPT-3.5 and GPT-4, contributing to reliability and integrity of AI-mediated information dissemination. We introduce 'Global-Liar,' a dataset uniquely balanced in terms of geographic and temporal representation, facilitating a more nuanced evaluation of LLM biases. Our analysis reveals that newer iterations of GPT models do not always equate to improved performance. Notably, the GPT-4 version from March demonstrates higher factual accuracy than its subsequent June release. Furthermore, a concerning bias is…

Taxonomy

Topics: International Arbitration and Investment Law

Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Discriminative Fine-Tuning · Dense Connections · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings