Adaptable Moral Stances of Large Language Models on Sexist Content: Implications for Society and Gender Discourse

Rongchen Guo; Isar Nejadgholi; Hillary Dawkins; Kathleen C. Fraser; Svetlana Kiritchenko

arXiv:2410.00175·cs.CL·October 2, 2024

TL;DR

This study examines how large language models can both critique and defend sexist content, revealing their diverse moral reasoning and ideological perspectives, and emphasizing the need for careful monitoring and safety measures.

Contribution

The paper provides a comprehensive analysis of LLMs' moral reasoning on sexism, highlights their potential for understanding and addressing societal biases, and warns against their misuse.

Findings

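01. All eight models produce comprehensible, contextually relevant explanations that both criticize and defend sexist content.

02. The models' outputs reflect diverse ideological perspectives on gender roles, with some aligning more with progressive views and others with conservative ones.

03. The authors caution that the models could be misused to justify sexist language.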

Abstract

This work provides an explanatory view of how LLMs can apply moral reasoning to both criticize and defend sexist language. We assessed eight large language models, all of which demonstrated the capability to provide explanations grounded in varying moral perspectives for both critiquing and endorsing views that reflect sexist assumptions. With both human and automatic evaluation, we show that all eight models produce comprehensible and contextually relevant text, which is helpful in understanding diverse views on how sexism is perceived. Also, through analysis of moral foundations cited by LLMs in their arguments, we uncover the diverse ideological perspectives in models' outputs, with some models aligning more with progressive or conservative views on gender roles and sexism. Based on our observations, we caution against the potential misuse of LLMs to justify sexist language. We also…

Code & Models

Datasets

mft-moral/edos-sup (dataset, 33 downloads)

Videos

Adaptable Moral Stances of Large Language Models on Sexist Content: Implications for Society and Gender Discourse · underline

Taxonomy

Topics: Gender Politics and Representation