An Anytime Algorithm for Task and Motion MDPs
Siddharth Srivastava, Nishant Desai, Richard Freedman, Shlomo Zilberstein

TL;DR
This paper introduces a probabilistically complete anytime algorithm for task and motion Markov decision processes (MDPs). It computes policies for complex robotic tasks in stochastic environments and improves their quality as more computation time becomes available.
Contribution
It presents a novel anytime approach for solving task and motion MDPs, enabling scalable, incremental policy computation in stochastic settings.
Findings
Effective computation of task and motion policies for autonomous aircraft inspection
Algorithm improves solution quality over time and guarantees probabilistic completeness
Reduces computational effort compared to full policy computation
Abstract
Integrated task and motion planning has emerged as a challenging problem in sequential decision making, where a robot needs to compute high-level strategy and low-level motion plans for solving complex tasks. While high-level strategies require decision making over longer time-horizons and scales, their feasibility depends on low-level constraints based upon the geometries and continuous dynamics of the environment. The hybrid nature of this problem makes it difficult to scale; most existing approaches focus on deterministic, fully observable scenarios. We present a new approach where the high-level decision problem occurs in a stochastic setting and can be modeled as a Markov decision process. In contrast to prior efforts, we show that complete MDP policies, or contingent behaviors, can be computed effectively in an anytime fashion. Our algorithm continuously improves the quality of…
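To make the "anytime" idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it uses plain value iteration on a tiny, made-up MDP and ignores motion planning entirely. The state names, actions, rewards, and the helper names `TOY_MDP`, `q_value`, and `anytime_value_iteration` are all assumptions introduced for illustration. The only point is the anytime property: after every sweep a usable policy is available, and further sweeps can only refine the value estimates it is based on.

```python
from typing import Dict, Iterator, List, Tuple

State = str
Action = str
# transitions[(state, action)] -> list of (probability, next_state, reward)
Transitions = Dict[Tuple[State, Action], List[Tuple[float, State, float]]]

# Hypothetical toy problem (not from the paper): "fast" may fail and leave the
# robot where it started, "safe" always works but takes an extra step.
TOY_MDP: Transitions = {
    ("start", "fast"): [(0.6, "goal", 10.0), (0.4, "start", -1.0)],
    ("start", "safe"): [(1.0, "mid", -1.0)],
    ("mid", "go"): [(1.0, "goal", 10.0)],
}


def q_value(transitions: Transitions, values: Dict[State, float],
            state: State, action: Action, gamma: float) -> float:
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + gamma * values[ns])
               for p, ns, r in transitions[(state, action)])


def anytime_value_iteration(
    transitions: Transitions, gamma: float = 0.9,
    max_sweeps: int = 100, tol: float = 1e-9,
) -> Iterator[Tuple[Dict[State, Action], Dict[State, float], float]]:
    """Yield (policy, values, residual) after every sweep.

    The caller may stop consuming the generator at any time and keep the
    most recent policy, which is what makes this an anytime procedure.
    """
    states = sorted({s for s, _ in transitions}
                    | {ns for outs in transitions.values() for _, ns, _ in outs})
    actions = {s: sorted(a for (s2, a) in transitions if s2 == s) for s in states}
    values = {s: 0.0 for s in states}

    for _ in range(max_sweeps):
        residual = 0.0
        for s in states:
            if not actions[s]:  # terminal state: value stays at 0
                continue
            best = max(q_value(transitions, values, s, a, gamma)
                       for a in actions[s])
            residual = max(residual, abs(best - values[s]))
            values[s] = best  # in-place (Gauss-Seidel style) backup
        # Greedy policy with respect to the current value estimates.
        policy = {
            s: max(actions[s],
                   key=lambda a: q_value(transitions, values, s, a, gamma))
            for s in states if actions[s]
        }
        yield policy, dict(values), residual
        if residual < tol:
            return
```

A caller with a deadline would iterate over `anytime_value_iteration(TOY_MDP)` and break out of the loop when time runs out, keeping the last policy it received.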
Taxonomy
Topics: Robotic Path Planning Algorithms · Reinforcement Learning in Robotics · Robot Manipulation and Learning
