Automated Long Answer Grading with RiceChem Dataset
Shashank Sonkar, Kangqi Ni, Lesa Tran Lu, Kristi Kincaid, John S. Hutchinson, Richard G. Baraniuk

TL;DR
This paper introduces RiceChem, a new dataset for automated long answer grading in education, and proposes a rubric entailment approach using natural language inference models, demonstrating its effectiveness and highlighting the task's complexity.
Contribution
The paper presents RiceChem, a novel dataset for automated long answer grading (ALAG), and introduces a rubric entailment formulation that leverages transfer learning from MNLI to improve automated assessment of long answers.
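To make the rubric entailment formulation concrete, here is a minimal sketch, not the authors' implementation. It treats the student response as the NLI premise and each rubric item as a hypothesis, then sums the points of the items judged entailed. The names `RubricItem`, `build_nli_pairs`, `grade`, and `keyword_stub` are hypothetical, as are the per-item point values and the sample question content. `keyword_stub` is only a toy stand-in for a fine-tuned NLI model so that the example runs without external dependencies.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RubricItem:
    text: str      # criterion a complete answer should address
    points: float  # hypothetical point value for this criterion


def build_nli_pairs(response: str, rubric: List[RubricItem]) -> List[Tuple[str, str]]:
    """Pair the student response (premise) with each rubric item (hypothesis)."""
    return [(response, item.text) for item in rubric]


def grade(
    response: str,
    rubric: List[RubricItem],
    is_entailed: Callable[[str, str], bool],
) -> float:
    """Sum the points of every rubric item the predictor judges as entailed."""
    pairs = build_nli_pairs(response, rubric)
    return sum(
        item.points
        for item, (premise, hypothesis) in zip(rubric, pairs)
        if is_entailed(premise, hypothesis)
    )


def keyword_stub(premise: str, hypothesis: str) -> bool:
    """Toy stand-in for an NLI model (assumption, for illustration only).

    Returns True when every hypothesis word longer than three letters
    also appears in the premise.
    """
    premise_words = set(re.findall(r"[a-z]+", premise.lower()))
    content_words = [w for w in re.findall(r"[a-z]+", hypothesis.lower()) if len(w) > 3]
    return all(w in premise_words for w in content_words)


# Made-up example data, not taken from RiceChem.
response = (
    "Ionization energy increases across a period because "
    "the nuclear charge increases."
)
rubric = [
    RubricItem("Nuclear charge increases", 1.0),
    RubricItem("Shielding stays constant", 1.0),
]

score = grade(response, rubric, keyword_stub)
print(score)  # prints 1.0: only the first rubric item is matched
```

In practice the `is_entailed` callable would wrap a model first trained on MNLI and then fine-tuned on (response, rubric item) pairs. Keeping the predictor as a parameter means the pairing and scoring logic does not depend on which model is used.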
Findings
Rubric-based formulation outperforms traditional scoring methods.
Transfer learning from MNLI improves model performance on RiceChem.
LLMs show lower performance on ALAG, indicating the task's complexity.
Abstract
We introduce a new area of study in the field of educational Natural Language Processing: Automated Long Answer Grading (ALAG). Distinguishing itself from Automated Short Answer Grading (ASAG) and Automated Essay Grading (AEG), ALAG presents unique challenges due to the complexity and multifaceted nature of fact-based long answers. To study ALAG, we introduce RiceChem, a dataset derived from a college chemistry course, featuring real student responses to long-answer questions with an average word count notably higher than typical ASAG datasets. We propose a novel approach to ALAG by formulating it as a rubric entailment problem, employing natural language inference models to verify whether each criterion, represented by a rubric item, is addressed in the student's response. This formulation enables the effective use of MNLI for transfer learning, significantly improving the performance…
Taxonomy
Topics: Educational Technology and Assessment · Traditional Chinese Medicine Studies
Methods: Attention Is All You Need · Dense Connections · Residual Connection · Softmax · Adam · Layer Normalization · Attention Dropout · Linear Layer · Multi-Head Attention
