Enhancing Science Classroom Discourse Analysis through Joint Multi-Task Learning for Reasoning-Component Classification
Jiho Noh, Mukhesh Raghava Katragadda, Raymond Carl, Soon Lee

TL;DR
This paper introduces ADAS, an automated discourse analysis system that jointly classifies classroom utterances by utterance type and reasoning component, using LLM-based data augmentation and a RoBERTa-base classifier to improve recognition of minority classes.
Contribution
The study presents a novel joint classification approach for discourse analysis, incorporating data augmentation and detailed pattern analysis to enhance understanding of classroom interactions.
Findings
LLM-based data augmentation improves minority class recognition.
Teacher feedback moves are strongly linked to student inferential reasoning.
The structural simplicity of reasoning-component classification makes it tractable for lexical models.
Abstract
Analyzing the reasoning patterns of students in science classrooms is critical for understanding knowledge-construction mechanisms and for improving instructional practice to maximize cognitive engagement, yet manual coding of classroom discourse at scale remains prohibitively labor-intensive. We present an automated discourse analysis system (ADAS) that jointly classifies teacher and student utterances along two complementary dimensions, Utterance Type (UT) and Reasoning Component (RC), derived from our prior CDAT framework. To address severe label imbalance among minority classes, we (1) stratify-resplit the annotated corpus, (2) apply LLM-based synthetic data augmentation targeting minority classes, and (3) train a dual-probe head RoBERTa-base classifier. A zero-shot GPT-5.4 baseline achieves macro-F1 of 0.467 on UT and 0.476 on RC, establishing meaningful upper bounds for prompt-only approaches…
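The abstract reports macro-F1, which averages per-class F1 scores with equal weight and is therefore sensitive to how well minority classes are recognized. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the labels "A" and "B" and the toy predictions are hypothetical and chosen only to show how an imbalanced label set affects the score.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced toy data: 8 majority-class and 2 minority-class items.
gold = ["A"] * 8 + ["B"] * 2
pred = ["A"] * 10  # a classifier that always predicts the majority class

accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(f"accuracy = {accuracy:.3f}, macro-F1 = {macro_f1(gold, pred):.3f}")
```

In this toy case the always-majority classifier is right on 8 of 10 items, yet its macro-F1 is much lower because the missed minority class contributes an F1 of zero to the average. This is why macro-F1 is a common choice when minority classes matter.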
