HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments
Qinhong Zhou, Sunli Chen, Yisong Wang, Haozhe Xu, Weihua Du, Hongxin Zhang, Yilun Du, Joshua B. Tenenbaum, Chuang Gan

TL;DR
The paper introduces HAZARD, a new simulated benchmark for evaluating embodied agents' decision-making in dynamic environments with unexpected disasters (fire, flood, and wind). The benchmark supports the use of large language models (LLMs) for common-sense reasoning and decision-making.
Contribution
It presents a novel dynamic environment benchmark and explores the use of large language models to enhance decision-making in embodied agents.
Findings
LLM-based agents show promise in dynamic decision-making.
The benchmark enables evaluation across multiple decision-making pipelines.
Challenges remain in applying LLMs effectively in real-time scenarios.
Abstract
Recent advances in high-fidelity virtual environments serve as one of the major driving forces for building intelligent embodied agents to perceive, reason and interact with the physical world. Typically, these environments remain unchanged unless agents interact with them. However, in real-world scenarios, agents might also face dynamically changing environments characterized by unexpected events and need to rapidly take action accordingly. To remedy this gap, we propose a new simulated embodied benchmark, called HAZARD, specifically designed to assess the decision-making abilities of embodied agents in dynamic situations. HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind, and specifically supports the utilization of large language models (LLMs) to assist common sense reasoning and decision-making. This benchmark enables us to evaluate autonomous…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Disaster Management and Resilience
