LAVQA: A Latency-Aware Visual Question Answering Framework for Shared Autonomy in Self-Driving Vehicles

Shuangyu Xie; Kaiyuan Chen; Wenjing Chen; Chengyuan Qian; Christian Juette; Liu Ren; Dezhen Song; Ken Goldberg

arXiv:2511.11840 · cs.RO · November 18, 2025

TL;DR

LAVQA is a framework that improves shared autonomy in self-driving cars by integrating visual question answering with latency-aware risk visualization, reducing collision rates under network delays.

Contribution

This paper introduces LAVQA, a latency-aware VQA framework that dynamically visualizes safety risks while accounting for network latency and human response delays.

Findings

01. LAVQA reduces collision rates by more than a factor of 8 compared to baselines.

02. It visualizes how safety regions change in the presence of dynamic obstacles and latency.

03. The framework supports remote operator decision-making in autonomous driving.

Abstract

When uncertainty is high, self-driving vehicles may halt for safety and benefit from access to remote human operators who can provide high-level guidance. This paradigm, known as shared autonomy, enables the autonomous vehicle and remote human operators to jointly formulate appropriate responses. To address critical decision timing under variable latency caused by wireless network delays and human response time, we present LAVQA, a latency-aware shared autonomy framework that integrates Visual Question Answering (VQA) and spatiotemporal risk visualization. LAVQA augments visual queries with the Latency-Induced COllision Map (LICOM), a dynamically evolving map that represents both temporal latency and spatial uncertainty. It enables remote operators to observe how the vehicle's safety regions vary over time in the presence of dynamic obstacles and delayed responses. Closed-loop simulations in…
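The abstract's idea that a safety region depends on both latency and spatial uncertainty can be illustrated with a small sketch. This is not the paper's LICOM algorithm, whose details are not given here. It assumes a constant-velocity obstacle, a total latency equal to network delay plus human response time, and an uncertainty radius that grows linearly with latency. All function names and parameters (`total_latency`, `is_cell_risky`, `risk_grid`, `uncertainty_rate_mps`) are hypothetical.

```python
import math


def total_latency(network_delay_s, human_response_s):
    """Assumed latency model: network delay plus human response time."""
    return network_delay_s + human_response_s


def is_cell_risky(cell_xy, obstacle_xy, obstacle_vel, base_radius_m,
                  uncertainty_rate_mps, latency_s):
    """Return True if a map cell falls inside the obstacle's inflated region.

    The obstacle center is propagated with a constant-velocity model over
    the latency, and its radius is inflated linearly with the latency.
    Both modeling choices are illustrative assumptions.
    """
    center_x = obstacle_xy[0] + obstacle_vel[0] * latency_s
    center_y = obstacle_xy[1] + obstacle_vel[1] * latency_s
    radius = base_radius_m + uncertainty_rate_mps * latency_s
    distance = math.hypot(cell_xy[0] - center_x, cell_xy[1] - center_y)
    return distance <= radius


def risk_grid(xs, ys, obstacle_xy, obstacle_vel, base_radius_m,
              uncertainty_rate_mps, latency_s):
    """Evaluate the risk test on a grid; rows follow ys, columns follow xs."""
    return [
        [
            is_cell_risky((x, y), obstacle_xy, obstacle_vel, base_radius_m,
                          uncertainty_rate_mps, latency_s)
            for x in xs
        ]
        for y in ys
    ]


if __name__ == "__main__":
    latency = total_latency(network_delay_s=0.5, human_response_s=1.5)
    grid = risk_grid(
        xs=[0.0, 2.0, 4.0, 6.0],
        ys=[-2.0, 0.0, 2.0],
        obstacle_xy=(0.0, 0.0),
        obstacle_vel=(2.0, 0.0),
        base_radius_m=1.0,
        uncertainty_rate_mps=0.5,
        latency_s=latency,
    )
    for row in grid:
        print("".join("#" if risky else "." for risky in row))
```

Under these assumptions, a cell that looks safe at zero latency can become risky once the delay is included, which is the kind of effect a latency-aware visualization would need to convey to the operator.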

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications · Human-Automation Interaction and Safety