Explanation for Trajectory Planning using Multi-modal Large Language Model for Autonomous Driving
Shota Yamazaki, Chenyu Zhang, Takuya Nanri, Akio Shigekane, Siyuan Wang, Jo Nishiyama, Tao Chu, Kohei Yokosawa

TL;DR
This paper introduces a reasoning model for autonomous driving that uses future planning trajectories as inputs to generate interpretable captions, enhancing transparency of decision-making.
Contribution
It presents a novel approach that incorporates future planning trajectories into caption generation, improving interpretability over existing methods.
Findings
The model effectively reflects future plans in generated captions.
Enhanced interpretability of autonomous vehicle decisions.
New dataset collected for training and evaluation.
Abstract
End-to-end autonomous driving models have been developed in recent years. These models lack interpretability in the decision-making process from perception to control of the ego vehicle, which can cause anxiety for passengers. One effective way to alleviate this is to build a model that outputs captions describing the future behaviors of the ego vehicle and the reasons for them. However, existing approaches generate reasoning text that inadequately reflects the ego vehicle's future plans, because they train models to output captions using momentary control signals as inputs. In this study, we propose a reasoning model that addresses this limitation by taking the future planning trajectories of the ego vehicle as inputs, using a newly collected dataset.
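The contrast the abstract draws between momentary control signals and future planning trajectories can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how trajectories are encoded or fed to the model, so the waypoint format (ego-frame x forward, y left, in metres), the 0.5 s spacing, and the function names `trajectory_to_prompt` and `net_heading_change_deg` are all hypothetical.

```python
import math
from typing import List, Tuple

# Hypothetical waypoint format: (x, y) in metres, ego frame, x forward, y left.
Waypoint = Tuple[float, float]


def trajectory_to_prompt(waypoints: List[Waypoint], dt: float = 0.5) -> str:
    """Serialize future waypoints into a text fragment (illustrative format only)."""
    parts = [
        f"t+{(i + 1) * dt:.1f}s: ({x:.1f}, {y:.1f})"
        for i, (x, y) in enumerate(waypoints)
    ]
    return "Planned trajectory: " + "; ".join(parts)


def net_heading_change_deg(waypoints: List[Waypoint]) -> float:
    """Heading difference between the first and last segments, in degrees."""
    if len(waypoints) < 2:
        raise ValueError("need at least two waypoints")
    (x0, y0), (x1, y1) = waypoints[0], waypoints[1]
    (xa, ya), (xb, yb) = waypoints[-2], waypoints[-1]
    first = math.atan2(y1 - y0, x1 - x0)
    last = math.atan2(yb - ya, xb - xa)
    diff = math.degrees(last - first)
    # Wrap into [-180, 180) so the sign indicates left (+) or right (-).
    return (diff + 180.0) % 360.0 - 180.0


# A plan that drives straight for one second and then bends left.
plan = [(5.0, 0.0), (10.0, 0.0), (14.0, 2.0), (16.0, 6.0)]
prompt_fragment = trajectory_to_prompt(plan)
heading_change = net_heading_change_deg(plan)
```

In this toy plan the vehicle is still driving straight at the current instant, so a momentary steering value near zero would give a captioning model no hint of the upcoming left turn, whereas the waypoint sequence does. This is only meant to show why the choice of input matters, not how the authors' model consumes trajectories.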
Taxonomy
Topics: Topic Modeling
