Investigating Automatic Scoring and Feedback using Large Language Models

Gloria Ashiya Katuka; Alexander Gain; Yen-Yun Yu

arXiv:2405.00602·cs.CL·May 2, 2024·3 cites

Open Access

TL;DR

This paper demonstrates that quantized large language models, fine-tuned with PEFT methods, can accurately score short answers and essays and generate feedback on them, while reducing computational cost and latency.

Contribution

It introduces the use of PEFT-based quantized LLMs for automatic grading and feedback, showing high accuracy and efficiency in these tasks.

Findings

01 Prediction error less than 3% on grade scores

02 Quantized LLaMA-2 13B outperforms base models in feedback quality

03 Effective for both proprietary and open-source datasets

Abstract

Automatic grading and feedback have long been studied using traditional machine learning and deep learning techniques based on language models. With the recent availability of high-performing large language models (LLMs) such as LLaMA-2, there is an opportunity to investigate their use for automatic grading and feedback generation. Despite their improved performance, LLMs require significant computational resources for fine-tuning, as well as task-specific adjustments to perform well on such tasks. To address these issues, Parameter-Efficient Fine-Tuning (PEFT) methods, such as LoRA and QLoRA, have been adopted to reduce the memory and computational requirements of model fine-tuning. This paper explores the efficacy of PEFT-based quantized models, employing a classification or regression head, to fine-tune LLMs for automatically assigning continuous numerical grades to…
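To make the memory argument behind LoRA concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the standard LoRA formulation, in which a frozen weight matrix W receives a trainable low-rank update scaled by alpha / r. The function names, matrix sizes, rank, and alpha values are all hypothetical choices for illustration and are not taken from the paper.

```python
# Illustrative sketch only. LoRA keeps a weight matrix W (d_out x d_in) frozen
# and trains a low-rank update B @ A, with B of shape d_out x r and A of shape
# r x d_in. All names and sizes here are assumptions, not values from the paper.

def full_param_count(d_out, d_in):
    """Number of parameters updated when fine-tuning the whole matrix."""
    return d_out * d_in


def lora_param_count(d_out, d_in, rank):
    """Number of parameters updated by LoRA: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x) without forming B @ A."""
    base = matvec(W, x)               # frozen pretrained path
    low = matvec(B, matvec(A, x))     # trainable low-rank path
    scale = alpha / rank
    return [b + scale * l for b, l in zip(base, low)]


def trainable_fraction(d_out, d_in, rank):
    """Share of the matrix's parameters that LoRA actually trains."""
    return lora_param_count(d_out, d_in, rank) / full_param_count(d_out, d_in)
```

For a hypothetical 4096 x 4096 projection with rank 8, `trainable_fraction(4096, 4096, 8)` shows that LoRA trains well under 1% of that matrix's parameters. When B is initialized to zeros, as is conventional for LoRA, `lora_forward` returns the same output as the frozen layer, so fine-tuning starts from the pretrained behavior. QLoRA additionally stores the frozen weights in a quantized format, which this sketch does not model.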

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Balanced Selection