SegmATRon: Embodied Adaptive Semantic Segmentation for Indoor Environment
Tatiana Zemskova, Margarita Kichik, Dmitry Yudin, Aleksei Staroverov, Aleksandr Panov

TL;DR
SegmATRon is an adaptive transformer model that improves indoor semantic segmentation by dynamically updating its weights during inference using multiple images, leveraging agent actions in simulated environments.
Contribution
This work introduces SegmATRon, a novel adaptive transformer that updates its weights during inference for improved semantic segmentation in indoor environments.
Findings
Using multiple images enhances segmentation quality.
Adaptive weight updating benefits indoor environment understanding.
The model is evaluated on datasets collected in the photorealistic Habitat and synthetic AI2-THOR simulators.
Abstract
This paper presents an adaptive transformer model named SegmATRon for embodied image semantic segmentation. Its distinctive feature is the adaptation of model weights during inference on several images using a hybrid multicomponent loss function. We studied this model on datasets collected in the photorealistic Habitat and the synthetic AI2-THOR Simulators. We showed that obtaining additional images using the agent's actions in an indoor environment can improve the quality of semantic segmentation. The code of the proposed approach and datasets are publicly available at https://github.com/wingrune/SegmATRon.
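The idea of adapting weights at inference time from several frames can be illustrated with a toy sketch. This is not the SegmATRon implementation: the model here is a hypothetical per-pixel logistic classifier rather than a transformer, and mean prediction entropy stands in for the paper's hybrid multicomponent loss, whose exact form is not given in this summary. All function names, the learning rate, and the data are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(weights, bias, pixel):
    # Linear score for one pixel's feature vector.
    return sum(w * x for w, x in zip(weights, pixel)) + bias

def mean_entropy(weights, bias, frames):
    """Mean binary entropy of per-pixel predictions over all frames.

    Stand-in loss (an assumption), not the paper's hybrid loss.
    """
    total, count = 0.0, 0
    for frame in frames:
        for pixel in frame:
            p = sigmoid(logit(weights, bias, pixel))
            total += -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))
            count += 1
    return total / count

def adapt(weights, bias, frames, lr=0.5):
    """One gradient-descent step on the stand-in loss.

    Returns new (weights, bias); the inputs are left unchanged.
    """
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    count = 0
    for frame in frames:
        for pixel in frame:
            z = logit(weights, bias, pixel)
            p = sigmoid(z)
            g = -z * p * (1.0 - p)  # derivative of binary entropy w.r.t. z
            for i, x in enumerate(pixel):
                grad_w[i] += g * x
            grad_b += g
            count += 1
    new_w = [w - lr * gw / count for w, gw in zip(weights, grad_w)]
    new_b = bias - lr * grad_b / count
    return new_w, new_b

def segment(weights, bias, frame):
    # Binary mask: 1 where the adapted model scores a pixel above zero.
    return [1 if logit(weights, bias, pixel) > 0 else 0 for pixel in frame]

weights, bias = [0.5, -0.25], 0.0
frames = [
    [[1.0, 0.0], [0.0, 1.0]],  # frame to segment
    [[1.0, 1.0], [2.0, 0.0]],  # extra frame after a hypothetical agent action
]
adapted_w, adapted_b = adapt(weights, bias, frames)
mask = segment(adapted_w, adapted_b, frames[0])
```

The structure is the point of the sketch: the extra frame contributes to the adaptation step, and the first frame is then segmented with the adapted weights rather than the original ones.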
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
