A Survey on Multimodal Large Language Models for Autonomous Driving

Can Cui; Yunsheng Ma; Xu Cao; Wenqian Ye; Yang Zhou; Kaizhao Liang; Jintai Chen; Juanwu Lu; Zichong Yang; Kuei-Da Liao; Tianren Gao; Erlong Li; Kun Tang; Zhipeng Cao; Tong Zhou; Ao Liu; Xinrui Yan; Shuqi Mei; Jianguo Cao; Ziran Wang; Chao Zheng

arXiv:2311.12320·cs.AI·November 22, 2023·27 cites

Open Access · 1 Repo

TL;DR

This survey reviews the development, current tools, datasets, benchmarks, and challenges of multimodal large language models in autonomous driving, highlighting their potential and outlining future research directions.

Contribution

It provides the first comprehensive overview of MLLMs in autonomous driving, including recent developments, existing tools, datasets, benchmarks, and key challenges to address.

Findings

01 Overview of MLLM development and history in autonomous driving

02 Summary of existing tools, datasets, and benchmarks

03 Identification of key challenges and future research directions

Abstract

With the emergence of Large Language Models (LLMs) and Vision Foundation Models (VFMs), multimodal AI systems built on large models have the potential to perceive the real world, make decisions, and control tools as humans do. In recent months, LLMs have attracted widespread attention in autonomous driving and map systems. Despite their immense potential, there is still no comprehensive understanding of the key challenges, opportunities, and future directions for applying LLMs to driving systems. In this paper, we present a systematic investigation of this field. We first introduce the background of Multimodal Large Language Models (MLLMs), the development of multimodal models using LLMs, and the history of autonomous driving. Then, we review existing MLLM tools for driving, transportation, and map systems, together with existing datasets and benchmarks. Moreover, we summarized the…

Code & Models

Repositories

irohxu/awesome-multimodal-llm-autonomous-driving
Official

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques