AI-Driven Modular Services for Accessible Multilingual Education in Immersive Extended Reality Settings: Integrating Speech Processing, Translation, and Sign Language Rendering
N.D. Tantaroudas, A.J. McCracken, I. Karachalios, E. Papatheou

TL;DR
This paper presents a modular AI platform integrating speech recognition, translation, sign language rendering, and emotion analysis for accessible multilingual education in immersive XR environments, validated through technical benchmarking.
Contribution
It introduces a novel modular system combining multiple AI services for real-time, accessible multilingual XR education, with comprehensive technical validation and benchmarking.
Findings
AWS Polly offers the lowest latency among the speech synthesis providers compared
EuroLLM 1.7B achieves a higher BLEU score than NLLB
Platform is suitable for real-time XR deployment
Abstract
This work introduces a modular platform that brings together six AI services: automatic speech recognition via OpenAI Whisper, multilingual translation through Meta NLLB, speech synthesis using AWS Polly, emotion classification with RoBERTa, dialogue summarisation via flan-t5-base-samsum, and International Sign (IS) rendering through Google MediaPipe. A corpus of IS gesture recordings was processed to derive hand landmark coordinates, which were subsequently mapped onto three-dimensional avatar animations inside a virtual reality (VR) environment. Validation comprised technical benchmarking of each AI component, including comparative assessments of speech synthesis providers and multilingual translation models (NLLB-200 and EuroLLM 1.7B variants). Technical evaluations confirmed the suitability of the platform for real-time XR deployment. Speech synthesis benchmarking established that…
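The modular design described in the abstract, with independent services chained together and each one benchmarked separately, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Pipeline` class, the `Stage` type, and the `fake_asr` and `fake_translate` stubs are hypothetical names introduced here, and the stubs stand in for real services such as Whisper and NLLB so that the example runs without any model or network access.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: every service is modelled as a callable from one string payload
# to the next. A real platform would pass audio buffers, landmarks, and so on.
Stage = Callable[[str], str]


@dataclass
class Pipeline:
    """Hypothetical container that chains named stages and times each one."""

    stages: List[Tuple[str, Stage]] = field(default_factory=list)

    def add(self, name: str, stage: Stage) -> "Pipeline":
        # Return self so stages can be registered in a fluent chain.
        self.stages.append((name, stage))
        return self

    def run(self, payload: str) -> Tuple[str, Dict[str, float]]:
        # Record per-stage wall-clock latency in seconds, in execution order.
        timings: Dict[str, float] = {}
        for name, stage in self.stages:
            start = time.perf_counter()
            payload = stage(payload)
            timings[name] = time.perf_counter() - start
        return payload, timings


def fake_asr(audio_ref: str) -> str:
    # Stub standing in for a speech recognition service (hypothetical).
    return "hello class"


def fake_translate(text: str) -> str:
    # Stub standing in for a translation service (hypothetical lookup table).
    lookup = {"hello class": "bonjour la classe"}
    return lookup.get(text, text)


pipeline = Pipeline().add("asr", fake_asr).add("translate", fake_translate)
result, timings = pipeline.run("lecture.wav")
```

Because each stage is only a callable behind a name, one provider can be swapped for another without touching the rest of the chain, and the per-stage timings give the kind of component-level latency figures that a benchmarking pass would compare.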
