SAFe-Copilot: Unified Shared Autonomy Framework
Phat Nguyen, Erfan Aasi, Shiva Sreeram, Guy Rosman, Andrew Silva, Sertac Karaman, Daniela Rus

TL;DR
SAFe-Copilot introduces a unified shared autonomy framework that uses vision language models to interpret driver intent and mediate control, significantly improving safety and alignment in autonomous driving scenarios.
Contribution
The paper presents a novel high-level arbitration method using language-based representations to better preserve driving intent and enhance shared autonomy performance.
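To make the contrast with trajectory-level blending concrete, here is a minimal sketch of arbitration over intent-level proposals. It is not the paper's method: the `Maneuver` labels, the `Proposal` type, and the hand-written rule in `arbitrate` are all hypothetical stand-ins for illustration, and the rule takes the place of the VLM-based mediation the paper describes.

```python
from dataclasses import dataclass
from enum import Enum


class Maneuver(Enum):
    """Hypothetical language-level intents (not taken from the paper)."""
    KEEP_LANE = "keep_lane"
    CHANGE_LANE_LEFT = "change_lane_left"
    STOP = "stop"


@dataclass(frozen=True)
class Proposal:
    source: str          # "human" or "autonomy"
    maneuver: Maneuver   # what the agent intends, not a geometric path
    rationale: str       # free-text explanation of the intent


def arbitrate(human: Proposal, autonomy: Proposal, hazard_ahead: bool) -> Proposal:
    """Toy arbitration rule standing in for a learned or VLM-based mediator."""
    # Agreement: nothing to mediate.
    if human.maneuver == autonomy.maneuver:
        return human
    # Assumed safety rule: with a hazard ahead, prefer whichever side wants to stop.
    if hazard_ahead:
        for proposal in (human, autonomy):
            if proposal.maneuver is Maneuver.STOP:
                return proposal
    # Otherwise preserve the driver's intent.
    return human


human = Proposal("human", Maneuver.CHANGE_LANE_LEFT, "pass the slow truck")
autonomy = Proposal("autonomy", Maneuver.STOP, "pedestrian entering the road")
chosen = arbitrate(human, autonomy, hazard_ahead=True)
print(chosen.source, chosen.maneuver.value)
```

Because the arbiter picks between whole intents, it never averages a lane change with a stop, which is the failure mode that blending two geometric paths can produce.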
Findings
Achieves perfect recall in mock-human studies
Reaches 92% agreement with human participants on arbitration decisions
Reduces collision rate and improves overall performance on Bench2Drive
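For readers unfamiliar with the two headline metrics, the sketch below shows how recall and an agreement rate are conventionally computed. It is not the paper's evaluation code: the function names and the example labels are assumptions made up for illustration, and this summary does not say what the paper counts as a positive case.

```python
from typing import Sequence


def recall(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Fraction of actual positives that were detected: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp + fn == 0:
        raise ValueError("recall is undefined without positive examples")
    return tp / (tp + fn)


def agreement_rate(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of cases in which two raters chose the same label."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


# Made-up labels: every true positive is caught, with one false alarm.
example_recall = recall([True, True, False, True], [True, True, True, True])

# Made-up arbitration choices from a human rater and a system.
example_agreement = agreement_rate(
    ["human", "autonomy", "human", "autonomy"],
    ["human", "autonomy", "autonomy", "autonomy"],
)
print(example_recall, example_agreement)
```

Note that recall ignores false positives, so perfect recall on its own says nothing about how often a system intervenes unnecessarily.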
Abstract
Autonomous driving systems remain brittle in rare, ambiguous, and out-of-distribution scenarios, where human drivers succeed through contextual reasoning. Shared autonomy has emerged as a promising approach to mitigate such failures by incorporating human input when autonomy is uncertain. However, most existing methods restrict arbitration to low-level trajectories, which represent only geometric paths and therefore fail to preserve the underlying driving intent. We propose a unified shared autonomy framework that integrates human input and autonomous planners at a higher level of abstraction. Our method leverages Vision Language Models (VLMs) to infer driver intent from multi-modal cues, such as driver actions and environmental context, and to synthesize coherent strategies that mediate between human and autonomous control. We first study the framework in a mock-human setting, where…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Human-Automation Interaction and Safety · Multimodal Machine Learning Applications
