WOMD-Reasoning: A Large-Scale Dataset for Interaction Reasoning in Driving

Yiheng Li; Cunxin Fan; Chongjian Ge; Zhihao Zhao; Chenran Li; Chenfeng Xu; Huaxiu Yao; Masayoshi Tomizuka; Bolei Zhou; Chen Tang; Mingyu Ding; Wei Zhan

arXiv:2407.04281 · cs.RO · May 27, 2025

TL;DR

WOMD-Reasoning is a large-scale, multi-modal Q&A dataset focused on reasoning about traffic rule-induced interactions in driving scenarios, enabling improved analysis of driving behaviors and interactions.

Contribution

The paper introduces WOMD-Reasoning, the largest multi-modal Q&A dataset for traffic interaction reasoning, and demonstrates its application through the Motion-LLaVA model.

Findings

1. WOMD-Reasoning contains 3 million Q&As covering diverse driving topics.
2. Motion-LLaVA effectively utilizes WOMD-Reasoning for interaction reasoning.
3. The dataset supports applications like interaction prediction and traffic rule compliance planning.

Abstract

Language models uncover unprecedented abilities in analyzing driving scenarios, owing to their limitless knowledge accumulated from text-based pre-training. Naturally, they should particularly excel in analyzing rule-based interactions, such as those triggered by traffic laws, which are well documented in texts. However, such interaction analysis remains underexplored due to the lack of dedicated language datasets that address it. Therefore, we propose Waymo Open Motion Dataset-Reasoning (WOMD-Reasoning), a comprehensive large-scale Q&A dataset built on WOMD that focuses on describing and reasoning about traffic rule-induced interactions in driving scenarios. WOMD-Reasoning is also by far the largest multi-modal Q&A dataset, with 3 million Q&As on real-world driving scenarios, covering a wide range of driving topics from map descriptions and motion status descriptions to narratives and…

Code & Models

Repositories

yhli123/womd-reasoning (tf, Official)

Taxonomy

Topics: Speech and dialogue systems