DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents

Ziqiao Ma; Ben VanDerPloeg; Cristian-Paul Bara; Huang Yidong; Eui-In Kim; Felix Gervits; Matthew Marge; Joyce Chai

arXiv:2210.12511·cs.AI·October 25, 2022

TL;DR

This paper introduces DOROTHIE, a simulation platform and benchmark for training autonomous driving agents to handle unexpected situations through dialogue, highlighting the difficulty of language-guided navigation in dynamic environments.

Contribution

The paper presents a novel simulation platform and benchmark for evaluating dialogue-based navigation in autonomous driving, along with a transformer baseline model.

Findings

1. Language-guided navigation remains extremely challenging for end-to-end models.

2. The SDN benchmark provides extensive data for studying situated communication in driving.

3. Baseline models show significant room for improvement in handling unpredictable scenarios.

Abstract

In the real world, autonomous driving agents navigate in highly dynamic environments full of unexpected situations where pre-trained models are unreliable. In these situations, what is immediately available to vehicles is often only human operators. Empowering autonomous driving agents with the ability to navigate in a continuous and dynamic environment and to communicate with humans through sensorimotor-grounded dialogue becomes critical. To this end, we introduce Dialogue On the ROad To Handle Irregular Events (DOROTHIE), a novel interactive simulation platform that enables the creation of unexpected situations on the fly to support empirical studies on situated communication with autonomous driving agents. Based on this platform, we created the Situated Dialogue Navigation (SDN), a navigation benchmark of 183 trials with a total of 8415 utterances, around 18.7 hours of control…

Code & Models

Repositories

sled-group/dorothie
PyTorch · Official

Datasets

sled-umich/SDN
dataset · 2.0k dl

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Multimodal Machine Learning Applications