SegmATRon: Embodied Adaptive Semantic Segmentation for Indoor Environment

Tatiana Zemskova; Margarita Kichik; Dmitry Yudin; Aleksei Staroverov; Aleksandr Panov

arXiv:2310.12031·cs.CV·October 19, 2023·1 cites

TL;DR

SegmATRon is an adaptive transformer model for indoor semantic segmentation. It updates its weights during inference on several images, which an embodied agent collects by acting in a simulated environment.

Contribution

This work introduces SegmATRon, a novel adaptive transformer that updates its weights during inference for improved semantic segmentation in indoor environments.

Findings

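01 Obtaining additional images through the agent's actions can improve segmentation quality.

02 Adapting the model weights during inference benefits indoor scene understanding.

03 The model was studied on datasets collected in the photorealistic Habitat and the synthetic AI2-THOR simulators.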

Abstract

This paper presents an adaptive transformer model named SegmATRon for embodied image semantic segmentation. Its distinctive feature is the adaptation of model weights during inference on several images using a hybrid multicomponent loss function. We studied this model on datasets collected in the photorealistic Habitat and the synthetic AI2-THOR simulators. We showed that obtaining additional images using the agent's actions in an indoor environment can improve the quality of semantic segmentation. The code for the proposed approach and the datasets are publicly available at https://github.com/wingrune/SegmATRon.
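The idea of adapting weights during inference on several frames can be illustrated with a toy sketch. It is not the SegmATRon architecture: the abstract does not specify the hybrid multicomponent loss, so mean prediction entropy is swapped in as a stand-in adaptation objective, and a tiny linear per-pixel classifier replaces the transformer. All names here (`adapt`, `segment`, `base`, `frames`) are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Class probabilities for one pixel feature vector x.
    weights holds one row of per-feature weights per class."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def mean_entropy(weights, frames):
    pixels = [x for frame in frames for x in frame]
    return sum(entropy(predict(weights, x)) for x in pixels) / len(pixels)

def adapt(weights, frames, lr=0.2, steps=10):
    """Return a copy of weights after a few gradient steps on the
    stand-in loss (mean entropy over all frames). The input weights
    are left untouched, so every episode starts from the same model."""
    adapted = [row[:] for row in weights]
    pixels = [x for frame in frames for x in frame]
    for _ in range(steps):
        grad = [[0.0] * len(row) for row in adapted]
        for x in pixels:
            p = predict(adapted, x)
            h = entropy(p)
            for k, pk in enumerate(p):
                # For softmax logits z: dH/dz_k = -p_k * (log p_k + H)
                dz = -pk * (math.log(pk) + h) if pk > 0.0 else 0.0
                for j, xj in enumerate(x):
                    grad[k][j] += dz * xj / len(pixels)
        for k, row in enumerate(grad):
            for j, g in enumerate(row):
                adapted[k][j] -= lr * g
    return adapted

def segment(weights, frame):
    """Arg-max class label for every pixel of one frame."""
    labels = []
    for x in frame:
        p = predict(weights, x)
        labels.append(p.index(max(p)))
    return labels

# Toy episode: 2 classes, 2 features per pixel, 2 pixels per frame.
base = [[0.5, -0.2], [-0.3, 0.4]]
frames = [
    [[1.0, 0.2], [0.1, 0.9]],  # first frame, the one to segment
    [[0.8, 0.3], [0.2, 1.0]],  # extra frame after a hypothetical agent action
]
adapted = adapt(base, frames)
labels = segment(adapted, frames[0])
```

The sketch keeps the base weights fixed and adapts a copy per episode, so the extra frames influence only the prediction for the scene they came from.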

Code & Models

Repositories

wingrune/segmatron (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning