
TL;DR
This thesis presents a POMDP-based framework for generalized 3D object search in robotics that exploits structure in the human world and in human-robot interaction, and demonstrates its effectiveness across multiple robot platforms and environments.
Contribution
It introduces a practical, environment-agnostic system for 3D object search using POMDPs, exploiting structural and linguistic cues, and deploys it on various robots for real-world tasks.
Findings
Successful deployment on a Boston Dynamics Spot robot, which found objects in under a minute
Effective handling of occlusion, limited view, and noisy detectors in 3D environments
System generalizes across different robots and environments
Abstract
Future collaborative robots must be capable of finding objects. As such a fundamental skill, we expect object search to eventually become an off-the-shelf capability for any robot, similar to e.g., object detection, SLAM, and motion planning. However, existing approaches either make unrealistic compromises (e.g., reduce the problem from 3D to 2D), resort to ad-hoc, greedy search strategies, or attempt to learn end-to-end policies in simulation that are yet to generalize across real robots and environments. This thesis argues that through using Partially Observable Markov Decision Processes (POMDPs) to model object search while exploiting structures in the human world (e.g., octrees, correlations) and in human-robot interaction (e.g., spatial language), a practical and effective system for generalized object search can be achieved. In support of this argument, I develop methods and…
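To make the POMDP framing concrete, the following is a minimal sketch of the belief update that underlies this kind of object search. It is not the thesis's implementation: it assumes a flat set of candidate cells rather than an octree, a single static target, and a binary detector. The function name `update_belief` and the parameters `tp_rate` and `fp_rate` are hypothetical, chosen for illustration only. It covers only the belief update, not the planner that chooses where to look next.

```python
import numpy as np

def update_belief(belief, in_view, detected, tp_rate=0.9, fp_rate=0.05):
    """Bayes update of a belief over candidate cells for one static target.

    belief   : 1-D array, probability the object is in each cell (sums to 1)
    in_view  : 1-D boolean array, True for cells inside the current field of view
    detected : True if the (noisy) detector fired on this view
    tp_rate  : assumed P(detector fires | object is in view)
    fp_rate  : assumed P(detector fires | object is not in view)
    """
    # Likelihood of a positive detection for each hypothesized object cell.
    p_fire = np.where(in_view, tp_rate, fp_rate)
    # Likelihood of the observation actually received.
    likelihood = p_fire if detected else 1.0 - p_fire
    posterior = belief * likelihood
    return posterior / posterior.sum()

# Four candidate cells, uniform prior; the robot views cells 0 and 1
# and the detector does not fire.
prior = np.full(4, 0.25)
in_view = np.array([True, True, False, False])
posterior = update_belief(prior, in_view, detected=False)
```

After a negative observation, probability mass shifts from the viewed cells to the unviewed ones, but the viewed cells keep some mass because the detector may have missed the object. This is how a belief-based searcher tolerates limited views and noisy detectors instead of ruling out regions outright.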
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Robotic Path Planning Algorithms
