Large Multimodal Models for Embodied Intelligent Driving: The Next Frontier in Self-Driving?
Long Zhang, Yuchen Xia, Bingqing Wei, Zhen Liu, Shiwen Mao, Zhu Han, Mohsen Guizani

TL;DR
This paper proposes a hybrid decision framework combining large multimodal models and deep reinforcement learning to enhance embodied intelligent driving, enabling continuous learning and joint decision-making for autonomous vehicles.
Contribution
It introduces a novel semantics and policy dual-driven hybrid framework integrating LMMs and DRL for improved autonomous driving capabilities.
Findings
Framework outperforms existing methods in lane-change planning tasks
Enables continuous learning through embodied AI interactions
Facilitates joint decision-making for autonomous driving systems
Abstract
The advent of Large Multimodal Models (LMMs) offers a promising way to address the limitations of modular design in autonomous driving, which often falters in open-world scenarios that require sustained environmental understanding and logical reasoning. In addition, embodied artificial intelligence facilitates policy optimization through closed-loop interactions to achieve continuous learning, thereby advancing autonomous driving toward embodied intelligent (EI) driving. However, relying solely on LMMs to enhance EI driving, without joint decision-making, constrains this capability. This article introduces a novel semantics and policy dual-driven hybrid decision framework to tackle this challenge, ensuring continuous learning and joint decision-making. The framework merges LMMs for semantic understanding and cognitive representation, and deep reinforcement learning…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics · Human-Automation Interaction and Safety
