Embodied Domain Adaptation for Object Detection
Xiangyu Shi, Yanyuan Qiao, Lingqiao Liu, Feras Dayoub

TL;DR
This paper presents a source-free domain adaptation method for open-vocabulary object detection in indoor environments, together with the EDAOD benchmark. The method improves zero-shot detection and adaptability to changing conditions through pseudo-label refinement and Mean Teacher training with contrastive learning.
Contribution
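The paper introduces a source-free domain adaptation approach for object detection that combines pseudo-label refinement via temporal clustering, multi-scale threshold fusion, and a Mean Teacher model with contrastive learning. It also introduces the EDAOD benchmark, which evaluates adaptation under sequential changes in lighting, layout, and object diversity.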
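To make the Mean Teacher idea concrete, here is a minimal sketch of its two generic ingredients: the teacher's weights track an exponential moving average (EMA) of the student's weights, and only confident teacher detections are kept as pseudo labels for the student. This is the standard Mean Teacher recipe, not the paper's implementation. The function names, the dictionary-based weights, the momentum value, and the single fixed score threshold are all assumptions for illustration; the paper's temporal clustering, multi-scale threshold fusion, and contrastive loss are not shown.

```python
# Illustrative sketch only: names, values, and data layout are hypothetical.

def ema_update(teacher, student, momentum=0.999):
    """Return new teacher weights as an EMA of the student weights.

    teacher, student: dicts mapping parameter name -> float.
    momentum: fraction of the old teacher value that is kept (assumed value).
    """
    return {
        name: momentum * t_val + (1.0 - momentum) * student[name]
        for name, t_val in teacher.items()
    }


def filter_pseudo_labels(detections, threshold=0.5):
    """Keep teacher detections whose confidence meets a fixed threshold.

    detections: list of dicts with at least a "score" key.
    A single fixed threshold is a simplification made for this sketch.
    """
    return [d for d in detections if d["score"] >= threshold]


if __name__ == "__main__":
    teacher = {"w": 1.0}
    student = {"w": 0.0}
    teacher = ema_update(teacher, student, momentum=0.9)
    print(teacher)

    teacher_detections = [
        {"label": "chair", "score": 0.82},
        {"label": "mug", "score": 0.31},
    ]
    print(filter_pseudo_labels(teacher_detections, threshold=0.5))
```

In a training loop, the student would be updated by gradient descent on the filtered pseudo labels, and ema_update would be called after each student step so that the teacher changes slowly and produces more stable labels.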
Findings
Significant improvements in zero-shot detection accuracy.
Effective adaptation to lighting, layout, and object diversity changes.
Benchmark results demonstrating robustness in dynamic indoor scenarios.
Abstract
Mobile robots rely on object detectors for perception and object localization in indoor environments. However, standard closed-set methods struggle to handle the diverse objects and dynamic conditions encountered in real homes and labs. Open-vocabulary object detection (OVOD), driven by Vision Language Models (VLMs), extends beyond fixed labels but still struggles with domain shifts in indoor environments. We introduce a Source-Free Domain Adaptation (SFDA) approach that adapts a pre-trained model without accessing source data. We refine pseudo labels via temporal clustering, employ multi-scale threshold fusion, and apply a Mean Teacher framework with contrastive learning. Our Embodied Domain Adaptation for Object Detection (EDAOD) benchmark evaluates adaptation under sequential changes in lighting, layout, and object diversity. Our experiments show significant gains in zero-shot…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
