DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents
Ziqiao Ma, Ben VanDerPloeg, Cristian-Paul Bara, Yidong Huang, Eui-In Kim, Felix Gervits, Matthew Marge, Joyce Chai

TL;DR
This paper introduces DOROTHIE, a simulation platform and benchmark for training autonomous driving agents to handle unexpected situations through dialogue, highlighting the difficulty of language-guided navigation in dynamic environments.
Contribution
The paper presents a novel simulation platform and benchmark for evaluating dialogue-based navigation in autonomous driving, along with a transformer baseline model.
Findings
Language-guided navigation remains extremely challenging for end-to-end models.
The SDN benchmark provides 183 trials with a total of 8415 utterances for studying situated communication in driving.
Baseline models show significant room for improvement in handling unpredictable scenarios.
Abstract
In the real world, autonomous driving agents navigate in highly dynamic environments full of unexpected situations where pre-trained models are unreliable. In these situations, what is immediately available to vehicles is often only human operators. Empowering autonomous driving agents with the ability to navigate in a continuous and dynamic environment and to communicate with humans through sensorimotor-grounded dialogue becomes critical. To this end, we introduce Dialogue On the ROad To Handle Irregular Events (DOROTHIE), a novel interactive simulation platform that enables the creation of unexpected situations on the fly to support empirical studies on situated communication with autonomous driving agents. Based on this platform, we created the Situated Dialogue Navigation (SDN), a navigation benchmark of 183 trials with a total of 8415 utterances, around 18.7 hours of control…
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Multimodal Machine Learning Applications
