On the Reliability of Large Language Models to Misinformed and Demographically-Informed Prompts

Toluwani Aremu; Oluwakemi Akinwehinmi; Chukwuemeka Nwagu; Syed Ishtiaque Ahmed; Rita Orji; Pedro Arnau Del Amo; Abdulmotaleb El Saddik

arXiv:2410.10850·cs.CL·October 18, 2024

TL;DR

This paper evaluates how reliably LLM-backed chatbots handle misinformed prompts and prompts containing demographic information in the sensitive domains of climate change and mental health, noting both their strengths and the ethical concerns they raise.

Contribution

It provides a comprehensive analysis of LLM performance on factual accuracy and bias in sensitive topics, emphasizing the need for ethical oversight and refinement.

Findings

01. LLM-backed chatbots answered closed-ended True/False questions reliably.

02. Concerns remain about privacy, ethical implications, and directing users to professional services.

03. Qualitative feedback from domain experts identifies areas for improvement.

Abstract

We investigate and observe the behaviour and performance of Large Language Model (LLM)-backed chatbots in addressing misinformed prompts and questions with demographic information within the domains of Climate Change and Mental Health. Through a combination of quantitative and qualitative methods, we assess the chatbots' ability to discern the veracity of statements, their adherence to facts, and the presence of bias or misinformation in their responses. Our quantitative analysis using True/False questions reveals that these chatbots can be relied on to give the right answers to these closed-ended questions. However, the qualitative insights gathered from domain experts show that there are still concerns regarding privacy, ethical implications, and the necessity for chatbots to direct users to professional services. We conclude that while these chatbots hold significant promise, their…

Code & Models

Repositories

tolusophy/Edge-of-Tomorrow (Official)

Taxonomy

Topics: Topic Modeling