RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring
Ali Ghiasvand Mohammadkhani

TL;DR
RDBE introduces a reasoning distillation approach that improves automatic essay scoring by providing interpretability and surpassing existing models in accuracy.
Contribution
It presents a novel method combining reasoning distillation with interpretability, outperforming baseline models in automatic essay scoring tasks.
Findings
RDBE outperforms zero-shot LLM and baseline models.
It achieves state-of-the-art results on the dataset.
It provides interpretability in essay scoring.
Abstract
Recently, various encoder-only and encoder-decoder pre-trained models like BERT and T5 have been applied to automatic essay scoring (AES) as small language models. However, existing studies have primarily treated this task akin to a classification problem, focusing solely on outputting scores in the target text without offering interpretations for the generated scores. Departing from these approaches, we introduce Reasoning Distillation-Based Evaluation (RDBE), which integrates interpretability to elucidate the rationale behind model scores while enhancing performance through initial reasoning. This interpretive capability is acquired during training by leveraging generated reasoning from a large language model (LLM) to distill a small language model (SLM). Our experimental results demonstrate the efficacy of RDBE across all scoring rubrics considered in the dataset. RDBE outperforms both…
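The distillation idea in the abstract can be illustrated with a minimal sketch of how one training pair for the small model might be assembled: the target text puts the LLM-generated reasoning first and the score second, so the score is produced after the rationale. The prompt template, the `Reasoning:`/`Score:` labels, and the names `DistillationExample`, `build_example`, and `parse_score` are all assumptions made for illustration; the abstract does not specify the paper's actual formats.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class DistillationExample:
    source: str  # text fed to the small language model
    target: str  # text the small language model is trained to generate


def build_example(essay: str, rubric: str, llm_reasoning: str, gold_score: int) -> DistillationExample:
    """Pair an essay with a reasoning-then-score target (hypothetical template)."""
    source = f"Rubric: {rubric}\nEssay: {essay}\nExplain, then score."
    # Reasoning comes first so the score is conditioned on the rationale.
    target = f"Reasoning: {llm_reasoning.strip()}\nScore: {gold_score}"
    return DistillationExample(source=source, target=target)


def parse_score(generated: str) -> Optional[int]:
    """Recover the integer score from generated text, or None if absent."""
    match = re.search(r"Score:\s*(\d+)", generated)
    return int(match.group(1)) if match else None
```

At inference time, under the same assumptions, the small model's output would be read by a human for the rationale and passed to `parse_score` to obtain the numeric score.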
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Topic Modeling · Innovative Teaching and Learning Methods
Methods: Gated Linear Unit · Attention Is All You Need · Byte Pair Encoding · Linear Warmup With Linear Decay · Inverse Square Root Schedule · SentencePiece · Dropout · WordPiece · Adam
