ADAPT: Benchmarking Commonsense Planning under Unspecified Affordance Constraints
Pei-An Chen, Yong-Ching Liang, Jia-Fong Yeh, Hung-Ting Su, Yi-Ting Chen, Min Sun, Winston Hsu

TL;DR
This paper introduces ADAPT, a module that adds affordance reasoning to embodied agents: it perceives object states and infers implicit preconditions, which improves robustness and task success in dynamic, unpredictable environments.
Contribution
The paper presents DynAfford, a new benchmark for testing agents' ability to handle changing affordances, and introduces ADAPT, a plug-and-play module for explicit affordance reasoning.
Findings
ADAPT improves task success in dynamic environments.
A domain-adapted vision-language model outperforms GPT-4o for affordance inference.
Incorporating affordance reasoning enhances robustness of embodied agents.
Abstract
Intelligent embodied agents should not simply follow instructions, as real-world environments often involve unexpected conditions and exceptions. However, existing methods usually focus on directly executing instructions, without considering whether the target objects can actually be manipulated, meaning they fail to assess available affordances. To address this limitation, we introduce DynAfford, a benchmark that evaluates embodied agents in dynamic environments where object affordances may change over time and are not specified in the instruction. DynAfford requires agents to perceive object states, infer implicit preconditions, and adapt their actions accordingly. To enable this capability, we introduce ADAPT, a plug-and-play module that augments existing planners with explicit affordance reasoning. Experiments demonstrate that incorporating ADAPT significantly improves robustness…
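The abstract describes ADAPT only at a high level, so the following is a minimal sketch of the general idea of wrapping an existing planner with an affordance check, not the authors' implementation. Every name in it (`Action`, `PRECONDITIONS`, `adapt_plan`, the `perceive` callback) and the microwave scenario are hypothetical illustrations. In particular, the fixed lookup table stands in for the precondition inference that the paper attributes to a vision-language model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Action:
    """One planner step, e.g. Action("heat", "microwave"). Hypothetical type."""
    verb: str
    target: str


# Hypothetical precondition table: (verb, target) -> (required state, recovery action).
# Assumption: a lookup replaces the learned affordance inference described in the paper.
PRECONDITIONS: Dict[Tuple[str, str], Tuple[str, Action]] = {
    ("heat", "microwave"): ("empty", Action("clear", "microwave")),
}


def adapt_plan(plan: List[Action], perceive: Callable[[str], str]) -> List[Action]:
    """Insert a recovery action before any step whose precondition is unmet.

    `perceive` maps an object name to its currently observed state.
    """
    adapted: List[Action] = []
    for action in plan:
        rule = PRECONDITIONS.get((action.verb, action.target))
        if rule is not None:
            required_state, recovery = rule
            # The instruction never mentions this constraint; it is checked here.
            if perceive(action.target) != required_state:
                adapted.append(recovery)
        adapted.append(action)
    return adapted


# Usage: the instruction says only "heat the mug", but the microwave is occupied.
observed = {"microwave": "occupied"}
plan = [Action("pickup", "mug"), Action("heat", "microwave")]
adapted = adapt_plan(plan, lambda obj: observed.get(obj, "unknown"))
```

Because the wrapper only inspects and extends the plan, the underlying planner stays unchanged, which is one way to read the "plug-and-play" description in the abstract.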
