NovPhy: A Testbed for Physical Reasoning in Open-world Environments
Chathura Gamage, Vimukthini Pinto, Cheng Xue, Peng Zhang, Ekaterina Nikonova, Matthew Stephenson, Jochen Renz

TL;DR
NovPhy is a new testbed designed to evaluate and develop AI agents' physical reasoning and adaptability in open-world environments with novel situations, highlighting current performance gaps between humans and AI.
Contribution
The paper introduces NovPhy, a comprehensive testbed for assessing physical reasoning and adaptability to novelties in open-world scenarios, with diverse tasks and evaluation metrics.
Findings
Humans outperform AI agents in physical reasoning and adaptability.
AI agents struggle to adapt quickly to novelties in physical scenarios.
Even agents with adaptability capabilities perform worse than humans under novelty conditions.
Abstract
Due to the emergence of AI systems that interact with the physical environment, there is an increased interest in incorporating physical reasoning capabilities into those AI systems. But is it enough to only have physical reasoning capabilities to operate in a real physical environment? In the real world, we constantly face novel situations we have not encountered before. As humans, we are competent at successfully adapting to those situations. Similarly, an agent needs to have the ability to function under the impact of novelties in order to properly operate in an open-world physical environment. To facilitate the development of such AI systems, we propose a new testbed, NovPhy, that requires an agent to reason about physical scenarios in the presence of novelties and take actions accordingly. The testbed consists of tasks that require agents to detect and adapt to novelties in…
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Context-Aware Activity Recognition Systems · Semantic Web and Ontologies
