Resource-Efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders
Kosta Dakic, Kanchana Thilakarathna, Rodrigo N. Calheiros, Teng Joon Lim

TL;DR
This paper introduces a semantic-guided masking strategy combined with masked autoencoders to improve communication efficiency in multiview perception systems, maintaining high detection and tracking accuracy while reducing data transmission.
Contribution
The novel semantic-guided masking approach integrated with MAEs enhances resource efficiency in multiview perception, outperforming random masking in accuracy and data reduction.
Findings
Achieves comparable detection and tracking performance at high masking ratios.
Reduces transmission data volume significantly compared to baseline methods.
Selective masking outperforms random masking in accuracy and efficiency.
Abstract
Multiview systems have become a key technology in modern computer vision, offering advanced capabilities in scene understanding and analysis. However, these systems face critical challenges in bandwidth limitations and computational constraints, particularly for resource-limited camera nodes like drones. This paper presents a novel approach for communication-efficient distributed multiview detection and tracking using masked autoencoders (MAEs). We introduce a semantic-guided masking strategy that leverages pre-trained segmentation models and a tunable power function to prioritize informative image regions. This approach, combined with an MAE, reduces communication overhead while preserving essential visual information. We evaluate our method on both virtual and real-world multiview datasets, demonstrating comparable performance in terms of detection and tracking performance metrics…
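The masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the power function, so this assumes each patch is kept with probability proportional to its mean semantic importance raised to an exponent `gamma`. The function name `semantic_guided_mask`, its parameters, and the `importance` map (standing in for the output of a pre-trained segmentation model) are all hypothetical.

```python
import numpy as np


def semantic_guided_mask(importance, patch=16, mask_ratio=0.75, gamma=2.0,
                         eps=1e-6, rng=None):
    """Return a boolean patch grid; True marks a patch that is kept (sent).

    importance: 2-D array in [0, 1], e.g. a foreground probability map.
    gamma: assumed tunable exponent; 0 gives uniform random masking,
           larger values concentrate kept patches on important regions.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = importance.shape
    gh, gw = h // patch, w // patch

    # Mean importance of each non-overlapping patch (edge remainders cropped).
    cropped = importance[:gh * patch, :gw * patch]
    scores = cropped.reshape(gh, patch, gw, patch).mean(axis=(1, 3))

    # Assumed power-function weighting; eps keeps every probability non-zero.
    weights = (scores.ravel() + eps) ** gamma
    probs = weights / weights.sum()

    # Keep a fixed fraction of patches, sampled without replacement.
    n_keep = max(1, round((1.0 - mask_ratio) * gh * gw))
    keep_idx = rng.choice(gh * gw, size=n_keep, replace=False, p=probs)

    keep = np.zeros(gh * gw, dtype=bool)
    keep[keep_idx] = True
    return keep.reshape(gh, gw)


if __name__ == "__main__":
    # Toy importance map: one bright 32x32 foreground region in a 64x64 image.
    imp = np.zeros((64, 64))
    imp[16:48, 16:48] = 1.0
    mask = semantic_guided_mask(imp, gamma=8.0, rng=np.random.default_rng(0))
    print(mask.astype(int))
```

In this sketch only the kept patches would be passed to the MAE encoder and transmitted, with the decoder reconstructing the rest at the receiver. Setting `gamma=0` recovers the random-masking baseline that the summary compares against.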
Taxonomy
Topics: Advanced Vision and Imaging · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
Methods: Masked autoencoder
