Can Large Language Models Differentiate Harmful from Argumentative Essays? Steps Toward Ethical Essay Scoring
Hongjin Kim, Jeonghyun Kang, Harksoo Kim

TL;DR
This paper evaluates the ability of Large Language Models to detect harmful content in essays, highlighting current limitations and emphasizing the need for ethically aware automated scoring systems.
Contribution
Introduces the Harmful Essay Detection benchmark to assess LLMs' effectiveness in recognizing ethically problematic content in essays.
Findings
LLMs need improvement to distinguish harmful from argumentative essays
Current AES models often overlook ethical considerations in scoring
Ethical sensitivity is important in automated essay scoring
Abstract
This study addresses critical gaps in Automated Essay Scoring (AES) systems and Large Language Models (LLMs) with regard to their ability to effectively identify and score harmful essays. Despite advancements in AES technology, current models often overlook ethically and morally problematic elements within essays, erroneously assigning high scores to essays that may propagate harmful opinions. In this study, we introduce the Harmful Essay Detection (HED) benchmark, which includes essays integrating sensitive topics such as racism and gender bias, to test the efficacy of various LLMs in recognizing and scoring harmful content. Our findings reveal that: (1) LLMs require further enhancement to accurately distinguish between harmful and argumentative essays, and (2) both current AES models and LLMs fail to consider the ethical dimensions of content during scoring. The study underscores the…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Explainable Artificial Intelligence (XAI)
