Transformer-Based Inpainting for Real-Time 3D Streaming in Sparse Multi-Camera Setups
Leif Van Holland, Domenic Zingsheim, Mana Takhsha, Hannah Dröge, Patrick Stotko, Markus Plack, Reinhard Klein

TL;DR
This paper introduces a transformer-based inpainting method for real-time 3D streaming from multi-camera setups, ensuring consistent, high-quality textures in AR/VR applications with a focus on speed and adaptability.
Contribution
The paper presents a novel, multi-view aware transformer architecture with spatio-temporal embeddings for real-time, resolution-independent inpainting in multi-camera 3D streaming.
Findings
Outperforms state-of-the-art inpainting methods in quality and speed
Achieves real-time performance with adaptive patch selection
Ensures temporal and multi-view consistency in inpainted textures
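The summary does not say how the adaptive patch selection works, so the following is only a minimal sketch of one plausible reading, not the authors' method: given a binary hole mask of the rendered view, rank fixed-size patches by how many pixels they are missing and process only the top few within a per-frame budget. The function name `select_patches` and the parameters `patch` and `budget` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def select_patches(
    mask: List[List[int]], patch: int, budget: int
) -> List[Tuple[int, int]]:
    """Return up to `budget` (row, col) patch indices, most missing pixels first.

    `mask` is a 2D grid where 1 marks a missing (hole) pixel and 0 a valid one.
    Patches without any missing pixel are never selected.
    NOTE: illustrative assumption only; the paper's actual strategy may differ.
    """
    height, width = len(mask), len(mask[0])
    scored = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Count hole pixels inside this patch (clipped at the image border).
            holes = sum(
                mask[y][x]
                for y in range(top, min(top + patch, height))
                for x in range(left, min(left + patch, width))
            )
            if holes > 0:
                scored.append((holes, top // patch, left // patch))
    # Most damaged patches first; ties broken by position for determinism.
    scored.sort(key=lambda s: (-s[0], s[1], s[2]))
    return [(row, col) for _, row, col in scored[:budget]]


# Toy 4x4 mask: the top-right 2x2 patch is fully missing, bottom-left has one hole.
toy_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
selected = select_patches(toy_mask, patch=2, budget=1)
```

With a budget of one patch, only the fully missing top-right patch is chosen; raising the budget adds the less damaged bottom-left patch. A budget of this kind is one way to trade inpainting quality against per-frame runtime.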
Abstract
High-quality 3D streaming from multiple cameras is crucial for immersive experiences in many AR/VR applications. The limited number of views, often due to real-time constraints, leads to missing information and incomplete surfaces in the rendered images. Existing approaches typically rely on simple heuristics for hole filling, which can result in inconsistencies or visual artifacts. We propose to complete the missing textures with a novel, application-targeted inpainting method that is independent of the underlying representation and runs as an image-based post-processing step after novel view rendering. The method is designed as a standalone module compatible with any calibrated multi-camera system. To this end, we introduce a multi-view aware, transformer-based network architecture that uses spatio-temporal embeddings to ensure consistency across frames while preserving fine details. Additionally,…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Computer Graphics and Visualization Techniques
