Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving
Zehao Wang, Huaide Jiang, Shuaiwu Dong, Yuping Wang, Hang Qiu, Jiachen Li

TL;DR
Drive My Way (DMW) is a personalized vision-language-action framework for autonomous driving that adapts to a driver's long-term habits and to real-time natural language instructions.
Contribution
The paper presents a personalized driving model that learns a user embedding from a driving dataset collected across multiple real drivers, conditions the policy on that embedding during planning, and uses natural language instructions as additional short-term guidance.
Findings
Improves style instruction adaptation on Bench2Drive
Generated behaviors are recognizable as individual drivers' styles
Enhances personalization in autonomous driving systems
Abstract
Human driving behavior is inherently personal: it is shaped by long-term habits and influenced by short-term intentions. Individuals differ in how they accelerate, brake, merge, yield, and overtake across diverse situations. However, existing end-to-end autonomous driving systems either optimize for generic objectives or rely on fixed driving modes, and so cannot adapt to individual preferences or interpret natural language intent. To address this gap, we propose Drive My Way (DMW), a personalized Vision-Language-Action (VLA) driving framework that aligns with users' long-term driving habits and adapts to real-time user instructions. DMW learns a user embedding from our personalized driving dataset collected across multiple real drivers and conditions the policy on this embedding during planning, while natural language instructions provide additional short-term guidance.…
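The conditioning idea in the abstract (a per-user embedding plus an instruction signal feeding the policy) can be illustrated with a toy sketch. This is not the authors' architecture: the embedding size, the word-bucketing instruction encoder, the linear policy, and every name below (`make_user_embeddings`, `encode_instruction`, `conditioned_policy`) are assumptions made purely for illustration.

```python
import random

EMBED_DIM = 4  # assumed size, chosen only for this example


def make_user_embeddings(user_ids, dim=EMBED_DIM, seed=0):
    """Create one vector per driver (hypothetical stand-in for a learned user embedding)."""
    rng = random.Random(seed)
    return {uid: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for uid in user_ids}


def encode_instruction(text, dim=EMBED_DIM):
    """Toy deterministic word-bucketing encoder (stand-in for a language model)."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # Bucket each word by the sum of its character codes.
        vec[sum(ord(c) for c in word) % dim] += 1.0
    return vec


def conditioned_policy(scene_features, user_embedding, instruction_embedding, weights):
    """Linear policy over the concatenated inputs; returns a single scalar action."""
    x = scene_features + user_embedding + instruction_embedding  # list concatenation
    if len(x) != len(weights):
        raise ValueError("weights must match the concatenated input length")
    return sum(w * xi for w, xi in zip(weights, x))


if __name__ == "__main__":
    embeddings = make_user_embeddings(["driver_a", "driver_b"])
    scene = [0.5, 0.2]  # placeholder scene features
    instruction = encode_instruction("please drive calmly")
    weights = [1.0] * (len(scene) + 2 * EMBED_DIM)
    for uid, emb in embeddings.items():
        print(uid, conditioned_policy(scene, emb, instruction, weights))
```

With the scene and instruction held fixed, the only input that changes between the two calls is the user embedding, so any difference in the output comes from the per-driver vector. That is the sense in which a policy is "conditioned" on a user embedding in this sketch.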
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Human-Automation Interaction and Safety · Multimodal Machine Learning Applications
