PRISM: A Methodology for Auditing Biases in Large Language Models

Leif Azzopardi; Yashar Moshfeghi

arXiv:2410.18906·cs.CL·November 12, 2024

TL;DR

PRISM is a novel inquiry-based methodology for indirectly auditing biases and preferences in large language models, revealing their political leanings and constraints more reliably than direct methods.

Contribution

The paper introduces PRISM, a flexible, task-based approach for auditing LLMs' biases, overcoming obfuscation and refusal issues in direct preference elicitation.

Findings

1. Most LLMs default to left-leaning, socially liberal positions.

2. Models vary in constraint and neutrality, with some being more compliant.

3. PRISM effectively uncovers biases and constraints in LLMs.

Abstract

Auditing Large Language Models (LLMs) to discover their biases and preferences is an emerging challenge in creating Responsible Artificial Intelligence (AI). While various methods have been proposed to elicit the preferences of such models, LLM trainers have taken countermeasures, such that LLMs hide, obfuscate or point-blank refuse to disclose their positions on certain subjects. This paper presents PRISM, a flexible, inquiry-based methodology for auditing LLMs that seeks to elicit such positions indirectly through task-based inquiry prompting rather than direct inquiry of said preferences. To demonstrate the utility of the methodology, we applied PRISM to the Political Compass Test, where we assessed the political leanings of twenty-one LLMs from seven providers. We show LLMs, by default, espouse positions that are economically left and socially liberal (consistent with…
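
The contrast between direct inquiry and task-based inquiry can be pictured with a minimal Python sketch. This is not the authors' implementation: the prompt wording, the essay-writing task, the function names, and the numeric values attached to the four agreement labels are all assumptions made for illustration, and the stance labels are assumed to come from some separate assessment step that is not shown.

```python
from statistics import mean

# Four agreement labels; the numeric scores are an illustrative assumption.
STANCE_SCORES = {
    "strongly disagree": -2,
    "disagree": -1,
    "agree": 1,
    "strongly agree": 2,
}


def build_direct_prompt(proposition: str) -> str:
    """Direct inquiry: ask the model outright for its position (hypothetical wording)."""
    return f'Do you agree or disagree with the following statement? "{proposition}"'


def build_task_prompt(proposition: str) -> str:
    """Task-based inquiry: give the model a task about the proposition
    (hypothetical wording) and infer the stance from what it produces."""
    return f'Write a short essay discussing the following statement: "{proposition}"'


def score_stances(labels):
    """Average the recognised stance labels and count everything else as a refusal."""
    normalised = [label.strip().lower() for label in labels]
    scored = [STANCE_SCORES[label] for label in normalised if label in STANCE_SCORES]
    return {
        "mean_stance": mean(scored) if scored else None,
        "refusals": len(normalised) - len(scored),
    }


if __name__ == "__main__":
    proposition = "Example proposition (placeholder, not a real test item)"
    print(build_direct_prompt(proposition))
    print(build_task_prompt(proposition))
    # Labels as they might come back from an assessment step (made-up values).
    print(score_stances(["Agree", "strongly agree", "refused"]))
```

Counting unrecognised labels as refusals keeps the two things the summary above distinguishes, a model's leaning and how constrained it is, as separate numbers rather than folding refusals into the average.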

Code & Models

Repositories

cis-phawm/prism (official)
