Are Large Language Models Reliable Argument Quality Annotators?

Nailia Mirzakhmedova; Marcel Gohsen; Chia Hao Chang; Benno Stein

arXiv:2404.09696 · cs.CL · April 16, 2024 · 1 citation

TL;DR

This paper investigates large language models (LLMs) as automated annotators of argument quality. It finds that they produce consistent annotations with moderately high agreement with human experts across most quality dimensions.

Contribution

The study compares LLM, human expert, and human novice annotations against an established taxonomy of argument quality dimensions, and shows that LLMs can serve as proxies for argument quality annotators.

Findings

01 LLMs show moderately high agreement with human experts across most argument quality dimensions.

02 Using LLMs as annotators improves overall annotation agreement.

03 LLMs can streamline the evaluation of argument datasets.

Abstract

Evaluating the quality of arguments is a crucial aspect of any system leveraging argument mining. However, it is a challenge to obtain reliable and consistent annotations regarding argument quality, as this usually requires domain-specific expertise of the annotators. Even among experts, the assessment of argument quality is often inconsistent due to the inherent subjectivity of this task. In this paper, we study the potential of using state-of-the-art large language models (LLMs) as proxies for argument quality annotators. To assess the capability of LLMs in this regard, we analyze the agreement between model, human expert, and human novice annotators based on an established taxonomy of argument quality dimensions. Our findings highlight that LLMs can produce consistent annotations, with a moderately high agreement with human experts across most of the quality dimensions. Moreover, we…
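The agreement analysis mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which agreement measure the authors use, so Cohen's kappa is swapped in here purely as a familiar example of chance-corrected agreement between two annotators. The function name `cohen_kappa`, the three-point rating scale, and the ratings themselves are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa).

    Illustrative only: the paper's actual agreement measure is not stated
    in the abstract, so this is an assumed stand-in.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement under independence, from each annotator's label
    # distribution (Counter returns 0 for labels one annotator never used).
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    # Degenerate case: both annotators used one identical label throughout.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings on an assumed 1-3 scale for one quality dimension.
expert_ratings = [1, 2, 3, 3, 2, 1, 2, 3]
llm_ratings = [1, 2, 3, 2, 2, 1, 3, 3]

kappa = cohen_kappa(expert_ratings, llm_ratings)
print(f"Expert vs. LLM agreement (Cohen's kappa): {kappa:.3f}")
```

Kappa treats the ratings as unordered categories. For ordinal quality scores or more than two annotators, a measure such as weighted kappa or Krippendorff's alpha may fit better; which one applies to the paper's setup is not something the abstract settles.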

Code & Models

Repositories

webis-de/ratio-24 (official)

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Natural Language Processing Techniques