Robot Planning and Situation Handling with Active Perception

Austine Oloo; Zainab Altaweel; Yohei Hayamizu; Peiqi Liu; Yan Ding; Saeid Amiri; Hao Yang; Andy Kaminski; Chad Esselink; Chris Paxton; Xiaohan Zhang; Shiqi Zhang

arXiv:2604.26988·cs.RO·May 1, 2026

TL;DR

This paper introduces VAP-TAMP, a framework that enables robots to actively perceive and handle unforeseen situations during task execution by integrating vision-language models and scene graph reasoning.

Contribution

The paper presents a novel planning and situation-handling framework that combines active perception, vision-language models, and scene graph reasoning for improved robot autonomy.

Findings

01

VAP-TAMP effectively detects and addresses unforeseen situations in simulation and real-world tests.

02

The framework enhances long-term autonomy by integrating perception and planning.

03

Experimental results demonstrate improved robustness in dynamic environments.

Abstract

Current robots are capable of computing plans to accomplish complex tasks. However, real-world environments are inherently open and dynamic, and unforeseen situations frequently arise during plan execution, such as jammed doors and objects that have fallen on the floor. These situations may result from the robot's own action failures or from external disturbances, such as human activities. Detecting and handling such execution-time situations remains a significant challenge, limiting robots' ability to achieve long-term autonomy. In this paper, we develop a planning and situation-handling framework, called VAP-TAMP, that enables robots to actively perceive and address unforeseen situations during plan execution. VAP-TAMP leverages action knowledge to strategically prompt vision-language models for active view selection and situation assessment, while constructing and reasoning over scene…
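
The abstract gives no implementation details, so the following is only an illustrative sketch of one way action knowledge could drive vision-language model prompts for view selection and situation assessment. It is not the authors' code. Every name here (`Action`, `build_questions`, `select_view`, `assess`, the `ask` callable standing in for a vision-language model, and the yes/no/unsure answer format) is a hypothetical assumption introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a vision-language model call:
# takes a view name and a question, returns "yes", "no", or "unsure".
AskFn = Callable[[str, str], str]


@dataclass(frozen=True)
class Action:
    """Assumed symbolic action model: facts expected before and after execution."""
    name: str
    preconditions: Tuple[str, ...]
    effects: Tuple[str, ...]


def facts_for(action: Action, phase: str) -> Tuple[str, ...]:
    """Return the facts to verify: preconditions before acting, effects after."""
    if phase not in ("before", "after"):
        raise ValueError("phase must be 'before' or 'after'")
    return action.preconditions if phase == "before" else action.effects


def build_questions(action: Action, phase: str) -> List[str]:
    """Turn each expected fact into one yes/no/unsure question."""
    return [
        f"Is it true that {fact}? Answer yes, no, or unsure."
        for fact in facts_for(action, phase)
    ]


def select_view(views: Sequence[str], questions: Sequence[str], ask: AskFn) -> str:
    """Pick the view with the fewest 'unsure' answers (ties go to the first view)."""
    def unsure_count(view: str) -> int:
        return sum(ask(view, q).strip().lower() == "unsure" for q in questions)
    return min(views, key=unsure_count)


def assess(action: Action, phase: str, views: Sequence[str], ask: AskFn):
    """Return the chosen view and the expected facts that view reports as violated."""
    questions = build_questions(action, phase)
    view = select_view(views, questions, ask)
    violated = [
        fact
        for fact, question in zip(facts_for(action, phase), questions)
        if ask(view, question).strip().lower() == "no"
    ]
    return view, violated
```

In this sketch, a non-empty `violated` list would signal an unforeseen situation, for example a door that is still closed after an open-door action, which a planner could then use to trigger replanning.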
