Grade Like a Human: Rethinking Automated Assessment with Large Language Models

Wenjing Xie; Juxin Niu; Chun Jason Xue; Nan Guan

arXiv:2405.19694 · cs.AI · May 31, 2024 · 6 cites

TL;DR

This paper presents a comprehensive LLM-based automated grading system that improves the entire grading process, including rubric design, scoring, and review, demonstrating effectiveness on new and existing datasets.

Contribution

It introduces a holistic approach to LLM-based automated grading that covers rubric creation, scoring, and post-grading review, whereas prior methods address only the scoring step.

Findings

01. Effective grading accuracy on new OS dataset
02. Improved consistency and fairness in scoring
03. Insights into LLM capabilities for comprehensive grading

Abstract

While large language models (LLMs) have been used for automated grading, they have not yet achieved the same level of performance as humans, especially when it comes to grading complex questions. Existing research on this topic focuses on a particular step in the grading procedure: grading using predefined rubrics. However, grading is a multifaceted procedure that encompasses other crucial steps, such as grading rubrics design and post-grading review. There has been a lack of systematic research exploring the potential of LLMs to enhance the entire grading process. In this paper, we propose an LLM-based grading system that addresses the entire grading procedure, including the following key components: 1) Developing grading rubrics that not only consider the questions but also the student answers, which can more accurately reflect students' performance. 2) Under the guidance of grading…
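
The three stages named in the abstract (rubric design, rubric-guided scoring, post-grading review) can be pictured as a small pipeline. The sketch below is only an illustration of that shape, not the authors' implementation: every function name is hypothetical, and simple keyword matching and a z-score check stand in for the LLM calls that the paper's system would make at each stage.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Criterion:
    keyword: str   # stand-in for a rubric item an LLM would judge
    points: float


def design_rubric(reference_keywords, student_answers, total_points=10.0):
    """Stage 1 (placeholder): look at the student answers as well as the
    reference, keeping the reference keywords that some answer mentions."""
    used = [k for k in reference_keywords
            if any(k.lower() in a.lower() for a in student_answers)]
    kept = used or list(reference_keywords)
    return [Criterion(k, total_points / len(kept)) for k in kept]


def score_answer(answer, rubric):
    """Stage 2 (placeholder): award points per criterion whose keyword appears."""
    return sum(c.points for c in rubric if c.keyword.lower() in answer.lower())


def review(scores, z_threshold=1.5):
    """Stage 3 (placeholder): flag indices of scores far from the mean
    so that they can be re-examined."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return []
    return [i for i, s in enumerate(scores) if abs(s - mu) / sigma > z_threshold]


def grade_all(reference_keywords, student_answers):
    """Run the three stages in order and return rubric, scores, and flags."""
    rubric = design_rubric(reference_keywords, student_answers)
    scores = [score_answer(a, rubric) for a in student_answers]
    return rubric, scores, review(scores)


if __name__ == "__main__":
    answers = [
        "A deadlock needs mutual exclusion and circular wait.",
        "Mutual exclusion only.",
        "I do not know.",
    ]
    keywords = ["mutual exclusion", "circular wait", "hold and wait"]
    rubric, scores, flagged = grade_all(keywords, answers)
    print(rubric, scores, flagged)
```

In a real system each placeholder would be replaced by a model call, and the review stage would return flagged answers to the scorer or to a human grader. The point of the sketch is only that the stages are separable and run in sequence.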

Code & Models

Repositories

wenjing1170/llm_grader
Official

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning