TL;DR
Re:Member is a system that uses personal memories and emotional speech synthesis to enhance engagement and affective recall in second language learning through stylized, memory-grounded interactions.
Contribution
It introduces a novel system that combines personal media, emotional speech styles, and multimodal alignment to support affective and engaging language learning experiences.
Findings
The system aligns the emotional tone of its generated speech with the visual context of the user's video.
Stylized spoken interactions are designed to increase learner engagement.
Memory-grounded questions are designed to encourage affective recall in L2 learners.
Abstract
We present Re:Member, a system that explores how emotionally expressive, memory-grounded interaction can support more engaging second language (L2) learning. By drawing on users' personal videos and generating stylized spoken questions in the target language, Re:Member is designed to encourage affective recall and conversational engagement. The system aligns emotional tone with visual context, using expressive speech styles such as whispers or late-night tones to evoke specific moods. It combines WhisperX-based transcript alignment, 3-frame visual sampling, and Style-BERT-VITS2 for emotional synthesis within a modular generation pipeline. Designed as a stylized interaction probe, Re:Member highlights the role of affect and personal media in learner-centered educational technologies.
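The abstract names a 3-frame visual sampling step but does not say how the frames are chosen. The sketch below shows one plausible reading, assuming the three frames are taken at the midpoints of three equal slices of a transcript-aligned segment. The `Segment` type, the `sample_frame_times` and `build_question_request` helpers, and the style labels are hypothetical names for illustration only. They are not the authors' code, nor the WhisperX or Style-BERT-VITS2 APIs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """A transcript segment with start/end times in seconds (hypothetical type)."""
    start: float
    end: float
    text: str


def sample_frame_times(segment: Segment, n_frames: int = 3) -> List[float]:
    """Return n_frames timestamps at the midpoints of equal slices of the segment.

    Assumption: evenly spaced sampling. The paper only says "3-frame".
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    duration = segment.end - segment.start
    if duration < 0:
        raise ValueError("segment end must not precede its start")
    step = duration / n_frames
    return [segment.start + step * (i + 0.5) for i in range(n_frames)]


def build_question_request(segment: Segment, style: str) -> Dict[str, object]:
    """Bundle what a downstream question generator and speech synthesizer might need.

    The style label (e.g. "whisper") is an illustrative placeholder.
    """
    return {
        "transcript": segment.text,
        "frame_times": sample_frame_times(segment),
        "speech_style": style,
    }


if __name__ == "__main__":
    seg = Segment(start=0.0, end=6.0, text="We watched the fireworks by the river.")
    print(build_question_request(seg, style="whisper"))
```

Keeping frame selection separate from style selection mirrors the modular pipeline the abstract describes: each stage can be replaced without touching the others.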