Asynchronous Pipeline Parallelism for Real-Time Multilingual Lip Synchronization in Video Communication Systems
Eren Caglar, Amirkia Rafiei Oskooei, Mehmet Kutanoglu, Mustafa Keles, and Mehmet S. Aktas

TL;DR
This paper presents an asynchronous pipeline framework for real-time multilingual lip synchronization in video conferencing. Concurrent module execution reduces end-to-end latency by up to 3.1 times compared to sequential approaches, while optimized module inference and semantic speech segmentation improve efficiency.
Contribution
It introduces a parallel, asynchronous Transformer-based architecture with message-queue decoupling, optimized inference workflows, and context-aware silence detection for enhanced real-time lip synchronization.
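To make the idea of silence-based speech segmentation concrete, here is a minimal sketch. It is a plain fixed-threshold energy detector, not the paper's context-aware component, whose details are not given here. The function name `split_on_silence` and the parameters `frame_len`, `threshold`, and `min_silence_frames` are assumptions chosen for illustration.

```python
def split_on_silence(samples, frame_len=4, threshold=0.01, min_silence_frames=2):
    """Return (start, end) sample-index pairs for speech segments.

    Illustrative only: a fixed energy threshold stands in for the
    context-aware silence detection described in the paper.
    """
    segments = []
    start = None          # start index of the segment being built, if any
    last_voiced_end = 0   # end index of the most recent voiced frame
    silent_run = 0        # consecutive silent frames seen inside a segment

    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(x * x for x in frame) / len(frame)  # mean squared amplitude

        if energy >= threshold:
            if start is None:
                start = i
            last_voiced_end = i + len(frame)
            silent_run = 0
        elif start is not None:
            silent_run += 1
            # Close the segment only after a long enough pause,
            # so short gaps between words do not split it.
            if silent_run >= min_silence_frames:
                segments.append((start, last_voiced_end))
                start = None
                silent_run = 0

    if start is not None:
        segments.append((start, last_voiced_end))
    return segments


if __name__ == "__main__":
    audio = [0.5] * 8 + [0.0] * 8 + [0.5] * 4
    print(split_on_silence(audio))
```

Each returned segment could then be handed to the downstream modules as one unit of work. A context-aware detector would presumably adapt the pause criterion to the surrounding speech rather than use constants as this sketch does.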
Findings
Reduces end-to-end latency by up to 3.1 times compared to sequential approaches
Improves processing speed and resource utilization
Maintains high translation accuracy and visual quality
Abstract
This paper introduces a parallel and asynchronous Transformer framework designed for efficient and accurate multilingual lip synchronization in real-time video conferencing systems. The proposed architecture integrates translation, speech processing, and lip-synchronization modules within a pipeline-parallel design that enables concurrent module execution through message-queue-based decoupling, reducing end-to-end latency by up to 3.1 times compared to sequential approaches. To enhance computational efficiency and throughput, the inference workflow of each module is optimized through low-level graph compilation, mixed-precision quantization, and hardware-accelerated kernel fusion. These optimizations provide substantial gains in efficiency while preserving model accuracy and visual quality. In addition, a context-adaptive silence-detection component segments the input speech stream at…
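The message-queue decoupling described in the abstract can be illustrated with a small sketch, assuming one worker thread per module and in-process bounded queues between them. The stage functions `translate`, `synthesize`, and `lip_sync` are hypothetical placeholders, and `run_stage` and `run_pipeline` are names invented for this example. The paper's actual queue technology and module interfaces are not specified here.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


def run_stage(fn, inbox, outbox):
    """Pull items from inbox, apply fn, and push results to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(segments, stage_fns, maxsize=4):
    """Run stage_fns as concurrent workers connected by bounded FIFO queues."""
    queues = [queue.Queue(maxsize=maxsize) for _ in range(len(stage_fns) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(fn, queues[i], queues[i + 1]), daemon=True)
        for i, fn in enumerate(stage_fns)
    ]
    for w in workers:
        w.start()

    def feed():
        # Feed from a separate thread so a full input queue
        # never blocks the caller from draining the output queue.
        for seg in segments:
            queues[0].put(seg)
        queues[0].put(_STOP)

    threading.Thread(target=feed, daemon=True).start()

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for w in workers:
        w.join()
    return results


# Hypothetical stand-ins for the real modules (assumptions, not the paper's code).
def translate(seg):
    return f"translated({seg})"


def synthesize(seg):
    return f"speech({seg})"


def lip_sync(seg):
    return f"lipsync({seg})"


if __name__ == "__main__":
    print(run_pipeline(["seg0", "seg1"], [translate, synthesize, lip_sync]))
```

Because each stage has its own worker, the translation of one segment can overlap with the lip synchronization of the previous one, which is the source of the latency gain over a sequential loop. The bounded queues apply backpressure, so a slow stage throttles the stages before it instead of letting work pile up without limit.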
Taxonomy
Topics: Speech and Audio Processing · Multimedia Communication and Technology · Digital Filter Design and Implementation
