Impact of Multimodal and Conversational AI on Learning Outcomes and Experience
Karan Taneja, Anjali Singh, Ashok K. Goel

TL;DR
This study investigates how multimodal and conversational AI systems affect learning outcomes in STEM education. It finds that multimodal, conversational AI enhances both learning and the learning experience, but that learners' perceptions do not always align with actual learning gains.
Contribution
It provides empirical evidence on the combined effects of multimodality and conversationality in AI-based learning tools, highlighting their impact on cognitive load and learning effectiveness.
Findings
MuDoC led to the highest post-test scores and positive experiences.
TexDoC was more engaging but resulted in lower post-test scores.
Multimodality increases germane load, improving learning outcomes.
Abstract
Multimodal Large Language Models (MLLMs) offer an opportunity to support multimedia learning through conversational systems grounded in educational content. However, while conversational AI is known to boost engagement, its impact on learning in visually-rich STEM domains remains under-explored. Moreover, there is limited understanding of how multimodality and conversationality jointly influence learning in generative AI systems. This work reports findings from a randomized controlled online study (N = 124) comparing three approaches to learning biology from textbook content: (1) a document-grounded conversational AI with interleaved text-and-image responses (MuDoC), (2) a document-grounded conversational AI with text-only responses (TexDoC), and (3) a textbook interface with semantic search and highlighting (DocSearch). Learners using MuDoC achieved the highest post-test scores and…
