HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments

Qinhong Zhou; Sunli Chen; Yisong Wang; Haozhe Xu; Weihua Du; Hongxin Zhang; Yilun Du; Joshua B. Tenenbaum; Chuang Gan

arXiv:2401.12975 · cs.CV · January 24, 2024 · 2 cites

TL;DR

The paper introduces HAZARD, a new benchmark for evaluating embodied agents' decision-making in dynamic environments with unexpected disasters, leveraging large language models for reasoning.

Contribution

It presents a novel dynamic environment benchmark and explores the use of large language models to enhance decision-making in embodied agents.

Findings

01 LLM-based agents show promise in dynamic decision-making.

02 The benchmark enables evaluation across multiple decision-making pipelines.

03 Challenges remain in applying LLMs effectively in real-time scenarios.

Abstract

Recent advances in high-fidelity virtual environments serve as one of the major driving forces for building intelligent embodied agents to perceive, reason and interact with the physical world. Typically, these environments remain unchanged unless agents interact with them. However, in real-world scenarios, agents might also face dynamically changing environments characterized by unexpected events and need to rapidly take action accordingly. To remedy this gap, we propose a new simulated embodied benchmark, called HAZARD, specifically designed to assess the decision-making abilities of embodied agents in dynamic situations. HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind, and specifically supports the utilization of large language models (LLMs) to assist common sense reasoning and decision-making. This benchmark enables us to evaluate autonomous…

Code & Models

Repositories

umass-foundation-model/hazard
pytorch · Official

Datasets

oscarqjh/HAZARD_easi
dataset · 14 dl

Videos

HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments · slideslive

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Disaster Management and Resilience