Evaluation of an LLM in Identifying Logical Fallacies: A Call for Rigor When Adopting LLMs in HCI Research

Gionnieve Lim; Simon T. Perrault

arXiv:2404.05213 · cs.HC · April 9, 2024 · 1 citation

Open Access

TL;DR

This paper critically evaluates GPT-4's ability to identify logical fallacies in the context of digital misinformation, emphasizing the need for rigorous assessment before deploying LLMs in HCI research.

Contribution

It presents a systematic evaluation of GPT-4's accuracy in detecting logical fallacies, highlighting the importance of critical adoption of LLMs in HCI applications.

Findings

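01. GPT-4 achieves an accuracy of 0.79 overall against a labeled dataset.

02. Excluding invalid or unidentified instances, accuracy rises to 0.90.

03. The paper describes the evaluation approach and reflects on using the LLM for the intended application.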

Abstract

There is increasing interest in the adoption of LLMs in HCI research. However, LLMs may often be regarded as a panacea because of their powerful capabilities with an accompanying oversight on whether they are suitable for their intended tasks. We contend that LLMs should be adopted in a critical manner following rigorous evaluation. Accordingly, we present the evaluation of an LLM in identifying logical fallacies that will form part of a digital misinformation intervention. By comparing to a labeled dataset, we found that GPT-4 achieves an accuracy of 0.79, and for our intended use case that excludes invalid or unidentified instances, an accuracy of 0.90. This gives us the confidence to proceed with the application of the LLM while keeping in mind the areas where it still falls short. The paper describes our evaluation approach, results and reflections on the use of the LLM for our…
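The two accuracy figures in the abstract differ only in which instances are counted. The sketch below illustrates that arithmetic on toy data. The function name `accuracy`, the label strings, and the assumption that exclusion is keyed on the model's output being `"invalid"` or `"unidentified"` are all hypothetical; the paper's actual labels and exclusion procedure are not given on this page.

```python
from typing import Iterable, Sequence


def accuracy(gold: Sequence[str], pred: Sequence[str],
             exclude: Iterable[str] = ()) -> float:
    """Share of predictions that match the gold label.

    Pairs whose prediction is listed in `exclude` are dropped before
    scoring (an assumed reading of "excludes invalid or unidentified
    instances").
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    skip = set(exclude)
    kept = [(g, p) for g, p in zip(gold, pred) if p not in skip]
    if not kept:
        raise ValueError("no instances left after exclusion")
    return sum(g == p for g, p in kept) / len(kept)


# Toy data for illustration only; these are not the paper's labels or results.
gold = ["ad hominem", "false dilemma", "circular reasoning",
        "ad hominem", "false dilemma"]
pred = ["ad hominem", "invalid", "circular reasoning",
        "false dilemma", "false dilemma"]

overall = accuracy(gold, pred)                                      # 3 of 5 correct
filtered = accuracy(gold, pred, exclude={"invalid", "unidentified"})  # 3 of 4 correct
print(f"overall={overall:.2f} filtered={filtered:.2f}")
```

Reporting both numbers, as the abstract does, keeps the size of the excluded set visible: the filtered score alone would hide how often the model fails to return a usable label.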

Taxonomy

Topics: Semantic Web and Ontologies · Software Engineering Research · Advanced Text Analysis Techniques

Methods: Attention Is All You Need · Softmax · Linear Layer · Layer Normalization · Dense Connections · Label Smoothing · Residual Connection · Dropout · Multi-Head Attention · Adam