On-Demand Instructional Material Providing Agent Based on MLLM for Tutoring Support
Takumi Kato, Masato Kikuchi, Tadachika Ozono

TL;DR
This paper presents an agent, built on a multimodal large language model, that analyzes tutoring dialogue and automatically retrieves relevant instructional images. In the authors' evaluation, it reduced average image retrieval time by 44.4 s and provided images of acceptable quality in 85.7% of trials.
Contribution
The study introduces a multimodal large language model (MLLM)-based agent that analyzes spoken dialogue between instructor and student, automatically generates search queries, and retrieves Web images on demand during one-on-one tutoring sessions.
Findings
Reduces average image retrieval time by 44.4 s compared to retrieval without agent support
Provides images of acceptable quality in 85.7% of trials
Supports instructors effectively during tutoring sessions
Abstract
Effective instruction in tutoring requires promptly providing instructional materials that match the needs of each student (e.g., in response to questions). In this study, we introduce an agent that automatically delivers supplementary materials on demand during one-on-one tutoring sessions. Our agent uses a multimodal large language model to analyze spoken dialogue between the instructor and the student, automatically generate search queries, and retrieve relevant Web images. Evaluation experiments demonstrate that our agent reduces the average image retrieval time by 44.4 s compared to cases without support and successfully provides images of acceptable quality in 85.7% of trials. These results indicate that our agent effectively supports instructors during tutoring sessions.
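The flow described in the abstract (dialogue analysis, query generation, image retrieval) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Utterance`, `MaterialAgent`, `keyword_query`, and `fake_search` are hypothetical, the MLLM is replaced by a toy text-only rule, and the search backend returns placeholder URLs. Only the control flow is meant to be informative.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class Utterance:
    speaker: str  # "instructor" or "student"
    text: str     # transcribed speech


@dataclass
class MaterialAgent:
    # Both callables are assumptions: generate_query stands in for an MLLM call,
    # search_images for a Web image search backend.
    generate_query: Callable[[List[Utterance]], str]
    search_images: Callable[[str], List[str]]
    window_size: int = 6
    _window: Deque[Utterance] = field(default_factory=deque, init=False)

    def observe(self, utterance: Utterance) -> None:
        """Add one utterance and keep only the most recent window_size turns."""
        self._window.append(utterance)
        while len(self._window) > self.window_size:
            self._window.popleft()

    def suggest(self, top_k: int = 3) -> List[str]:
        """Turn the recent dialogue into a query and return up to top_k image URLs."""
        if not self._window:
            return []
        query = self.generate_query(list(self._window))
        if not query:
            return []
        return self.search_images(query)[:top_k]


def keyword_query(utterances: List[Utterance]) -> str:
    # Toy stand-in for the MLLM: reuse the latest student utterance as the query.
    for utterance in reversed(utterances):
        if utterance.speaker == "student":
            return utterance.text.rstrip("?").strip()
    return ""


def fake_search(query: str) -> List[str]:
    # Placeholder backend: builds example.com URLs instead of calling a real API.
    slug = query.lower().replace(" ", "-")
    return [f"https://example.com/{slug}-{i}.png" for i in range(1, 6)]


agent = MaterialAgent(keyword_query, fake_search)
agent.observe(Utterance("instructor", "Let's review cell division."))
agent.observe(Utterance("student", "What does mitosis look like?"))
suggestions = agent.suggest(top_k=2)
```

Passing the query generator and search backend as callables keeps the dialogue handling separate from any particular model or search service, so either stand-in could be swapped for a real component without changing `MaterialAgent`.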
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Multimodal Machine Learning Applications · Topic Modeling
