Facial-Expression-Aware Prompting for Empathetic LLM Tutoring
Shuangquan Feng, Laura Fleig, Ruisen Tu, Philip Chi, Edmund Bu, Melinda Ozel, Junhua Ma, Teng Fei, Virginia R. de Sa

TL;DR
This paper shows that supplying an LLM tutor with lightweight, structured facial-expression cues such as Action Units (AUs) at the prompt level makes its responses more empathetic, without end-to-end retraining.
Contribution
The paper introduces a prompt-level method for incorporating facial expression cues into LLM tutoring, improving empathy without end-to-end model retraining.
Findings
AU-based conditioning improves empathetic responsiveness across models.
Peak-expression frame selection outperforms using a random facial frame.
Facial-expression-grounded empathy aligns better with human judgments.
Abstract
Large language models (LLMs) enable increasingly capable tutoring-style conversational agents, yet effective tutoring requires sensitivity to learners' affective and cognitive states beyond text alone. Facial expressions provide immediate and practical cues of confusion, frustration, or engagement, but remain underexplored in LLM-driven tutoring. We investigate whether facial-expression-aware signals can improve empathetic tutoring responses through prompt-level integration, without end-to-end retraining. We build a scalable simulated tutoring environment where a student agent exhibits diverse facial behaviors from a large unlabeled facial expression video dataset, and compare four tutor variants: a text-only LLM baseline, a multimodal baseline using a random facial frame, and two Action Unit estimation model (AUM)-based methods that either inject textual AU descriptions or select a…
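
The two AU-based ideas named in the abstract (injecting textual AU descriptions into the prompt, and selecting a frame by expression strength) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the peak criterion, the description wording, or the prompt template. The sketch assumes per-frame AU intensities in [0, 1] are already available from some AU estimation model, treats the frame with the largest summed intensity as the "peak", and uses hypothetical helper names (select_peak_frame, describe_aus, build_tutor_prompt) and a hypothetical 0.5 activation threshold.

```python
from typing import Dict, List

# Standard FACS names for a few Action Units; the choice of AUs is illustrative.
AU_LABELS: Dict[str, str] = {
    "AU01": "inner brow raiser",
    "AU04": "brow lowerer",
    "AU12": "lip corner puller",
    "AU15": "lip corner depressor",
}


def select_peak_frame(frames: List[Dict[str, float]]) -> int:
    """Return the index of the frame with the largest summed AU intensity.

    Assumption: "peak expression" is approximated by total AU activation.
    Ties resolve to the earliest frame.
    """
    if not frames:
        raise ValueError("frames must not be empty")
    return max(range(len(frames)), key=lambda i: sum(frames[i].values()))


def describe_aus(frame: Dict[str, float], threshold: float = 0.5) -> str:
    """Turn AU intensities into a short textual description for the prompt."""
    active = [(au, v) for au, v in frame.items() if v >= threshold]
    if not active:
        return "no pronounced facial action units"
    # Strongest activations first, so the most salient cue leads the text.
    active.sort(key=lambda item: item[1], reverse=True)
    return ", ".join(
        f"{AU_LABELS.get(au, au)} ({au}, intensity {v:.2f})" for au, v in active
    )


def build_tutor_prompt(student_message: str, au_description: str) -> str:
    """Assemble a tutor prompt that carries the facial cue as plain text."""
    return (
        "You are a tutor. Respond helpfully and with empathy.\n"
        f"Observed facial action units: {au_description}\n"
        f"Student: {student_message}"
    )


if __name__ == "__main__":
    # Made-up intensities for three video frames, for illustration only.
    frames = [
        {"AU04": 0.1, "AU12": 0.2},
        {"AU04": 0.9, "AU12": 0.1},
        {"AU04": 0.3, "AU12": 0.3},
    ]
    peak = select_peak_frame(frames)
    print(build_tutor_prompt("I don't get this step.", describe_aus(frames[peak])))
```

Because the cue enters only through the prompt string, the same function can wrap any chat model without touching its weights, which is the sense in which the approach avoids end-to-end retraining.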
