GAPLE: Generalizable Approaching Policy LEarning for Robotic Object Searching in Indoor Environment
Xin Ye, Zhe Lin, Joon-Young Lee, Jianming Zhang, Shibin Zheng, and Yezhou Yang

TL;DR
This paper introduces GAPLE, a method for learning a generalizable policy that lets a robot approach a target object in indoor environments, using depth and semantic segmentation as policy inputs to improve transfer across scenes.
Contribution
GAPLE feeds depth and semantic segmentation features to the policy learning module so that a robot can actively approach objects with better generalization across environments.
Findings
Validated on House3D dataset and real-world platform
Demonstrated improved generalization over scene-specific methods
Provided qualitative analysis of approach behaviors
Abstract
We study the problem of learning a generalizable action policy for an intelligent agent to actively approach an object of interest in an indoor environment solely from its visual inputs. While scene-driven or recognition-driven visual navigation has been widely studied, prior efforts suffer severely from limited generalization capability. In this paper, we first argue that the object searching task is environment-dependent while the approaching ability is general. To learn a generalizable approaching policy, we present a novel solution dubbed GAPLE, which adopts two channels of visual features, depth and semantic segmentation, as the inputs to the policy learning module. The empirical studies conducted on the House3D dataset as well as on a physical platform in a real-world scenario validate our hypothesis, and we further provide in-depth qualitative analysis.
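As a rough illustration of the two-channel input described in the abstract, the sketch below stacks a normalized depth map with a binary mask of the target object's semantic class. This is not the authors' implementation: the function name build_observation, the max_depth clipping value, the binary-mask encoding of the segmentation, and the class ids are all assumptions made for this example.

```python
def build_observation(depth, seg_labels, target_class, max_depth=10.0):
    """Stack a normalized depth channel and a binary target mask.

    Hypothetical helper, not taken from the paper.
    depth: 2D list of depths in meters.
    seg_labels: 2D list of integer semantic class ids (same shape as depth).
    Returns a list of two 2D channels: [depth_channel, target_mask].
    """
    if len(depth) != len(seg_labels) or any(
        len(d_row) != len(s_row) for d_row, s_row in zip(depth, seg_labels)
    ):
        raise ValueError("depth and seg_labels must have the same shape")

    # Clip depth to [0, max_depth] and scale to [0, 1] (assumed normalization).
    depth_channel = [
        [min(max(d, 0.0), max_depth) / max_depth for d in row] for row in depth
    ]
    # 1.0 where the pixel belongs to the target object class, else 0.0.
    target_mask = [
        [1.0 if label == target_class else 0.0 for label in row]
        for row in seg_labels
    ]
    return [depth_channel, target_mask]


# Toy 2x2 frame: class id 7 stands in for the object of interest.
depth = [[1.0, 2.5], [5.0, 12.0]]
seg = [[3, 7], [7, 0]]
obs = build_observation(depth, seg, target_class=7)
```

Neither channel carries raw color or texture, which is one plausible reading of why such inputs could transfer across scenes more readily than RGB frames; a learned policy would consume obs in place of the raw image.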
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
