ALIVE: An Avatar-Lecture Interactive Video Engine with Content-Aware Retrieval for Real-Time Interaction
Md Zabirul Islam, Md Motaleb Hossen Manik, Ge Wang

TL;DR
ALIVE is a real-time interactive video engine that runs entirely on local hardware. It augments passive lecture viewing with content-aware retrieval, avatar-delivered explanations, and multimodal student interaction.
Contribution
It introduces a unified, privacy-preserving system combining neural avatars, content-aware retrieval, and multimodal interaction for real-time lecture engagement.
Findings
Demonstrates accurate retrieval and low latency in a medical imaging course
Provides engaging, grounded explanations via avatar responses
Operates fully on local hardware, ensuring privacy and responsiveness
Abstract
Traditional lecture videos offer flexibility but lack mechanisms for real-time clarification, forcing learners to search externally when confusion arises. Recent advances in large language models and neural avatars provide new opportunities for interactive learning, yet existing systems typically lack lecture awareness, rely on cloud-based services, or fail to integrate retrieval and avatar-delivered explanations in a unified, privacy-preserving pipeline. We present ALIVE, an Avatar-Lecture Interactive Video Engine that transforms passive lecture viewing into a dynamic, real-time learning experience. ALIVE operates fully on local hardware and integrates (1) an avatar-delivered lecture generated through ASR transcription, LLM refinement, and neural talking-head synthesis; (2) a content-aware retrieval mechanism that combines semantic similarity with timestamp alignment to surface…
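The abstract describes retrieval that combines semantic similarity with timestamp alignment but, as excerpted here, does not give the scoring rule. The following is a minimal Python sketch of one way such a combination could work, not the authors' implementation. It assumes each transcript segment carries a precomputed embedding and that the learner's question arrives with the playback time at which the video was paused. The names `Segment`, `temporal_proximity`, and `rank_segments`, the weight `alpha`, and the decay constant `tau` are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    start: float            # segment start time in seconds
    end: float              # segment end time in seconds
    text: str               # transcript text for the segment
    embedding: List[float]  # assumed to be precomputed by some embedding model


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def temporal_proximity(seg: Segment, pause_time: float, tau: float = 60.0) -> float:
    """1.0 if the pause point lies inside the segment, else exponential decay
    in the gap (seconds) to the nearest segment boundary. Hypothetical choice."""
    if seg.start <= pause_time <= seg.end:
        return 1.0
    gap = min(abs(pause_time - seg.start), abs(pause_time - seg.end))
    return math.exp(-gap / tau)


def rank_segments(
    query_emb: List[float],
    segments: List[Segment],
    pause_time: float,
    alpha: float = 0.7,   # assumed weight on semantic similarity
    tau: float = 60.0,    # assumed temporal decay constant in seconds
    top_k: int = 3,
) -> List[Tuple[float, Segment]]:
    """Score each segment as a weighted sum of semantic and temporal terms."""
    scored = [
        (
            alpha * cosine(query_emb, s.embedding)
            + (1 - alpha) * temporal_proximity(s, pause_time, tau),
            s,
        )
        for s in segments
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy usage: two segments match the query equally well semantically,
# so the one nearest the pause point (90 s) ranks first.
segments = [
    Segment(0, 60, "intro", [1.0, 0.0]),
    Segment(60, 120, "ct", [0.0, 1.0]),
    Segment(600, 660, "mri", [0.0, 1.0]),
]
top = rank_segments([0.0, 1.0], segments, pause_time=90.0)
```

In this sketch the temporal term only breaks ties and nudges rankings toward the part of the lecture the learner is watching. A larger `alpha` lets a strong semantic match far from the pause point still win.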
Taxonomy
Topics: Multimodal Machine Learning Applications · Emotion and Mood Recognition · Intelligent Tutoring Systems and Adaptive Learning
