Analyzing Large Language Models for Classroom Discussion Assessment

Nhat Tran; Benjamin Pierce; Diane Litman; Richard Correnti; Lindsay Clare Matsumura

arXiv:2406.08680·cs.CL·June 14, 2024·1 cites

TL;DR

This paper evaluates how large language models can assess classroom discussions, analyzing the impact of task formulation, context length, and few-shot examples on performance, and balancing accuracy with efficiency and consistency.

Contribution

It provides an empirical analysis of factors influencing LLM-based assessment performance and offers recommendations for effective, efficient, and consistent evaluation methods.

Findings

01 Task formulation affects assessment accuracy.

02 Context length influences model performance.

03 Consistency correlates with predictive accuracy.

Abstract

Automatically assessing classroom discussion quality is becoming increasingly feasible with the help of new NLP advancements such as large language models (LLMs). In this work, we examine how the assessment performance of 2 LLMs interacts with 3 factors that may affect performance: task formulation, context length, and few-shot examples. We also explore the computational efficiency and predictive consistency of the 2 LLMs. Our results suggest that the 3 aforementioned factors do affect the performance of the tested LLMs and that there is a relation between consistency and performance. We recommend an LLM-based assessment approach that has a good balance in terms of predictive performance, computational efficiency, and consistency.
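
The abstract refers to predictive consistency without defining it here. One common way to quantify it, shown below as a minimal sketch and not necessarily the measure used in the paper, is to run the same prompt several times and report the share of runs that agree with the majority label. The function names, the `runs` data, and the labels are all hypothetical and only illustrate the idea.

```python
from collections import Counter

def majority_label(labels):
    """Return the most frequent label across repeated runs.

    Ties go to the label that appeared first in the list.
    """
    return Counter(labels).most_common(1)[0][0]

def consistency(labels):
    """Fraction of repeated runs that agree with the majority label.

    This is an assumed, illustrative definition of consistency,
    not a definition taken from the paper.
    """
    if not labels:
        raise ValueError("need at least one prediction")
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)

# Hypothetical predictions: three repeated LLM runs per discussion segment.
runs = {
    "segment_1": ["high", "high", "high"],
    "segment_2": ["high", "low", "high"],
}

# Per-segment consistency, then a simple average over segments.
scores = {name: consistency(preds) for name, preds in runs.items()}
mean_consistency = sum(scores.values()) / len(scores)
```

Comparing a score like `mean_consistency` against accuracy on the majority labels is one way to explore the relation between consistency and performance that the abstract reports.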

Code & Models

Repositories

nhattlm95/LLM_for_Classroom_Discussion
PyTorch · Official

Taxonomy

Topics: Educational Technology and Assessment