Temporal Misalignment Attacks against Multimodal Perception in Autonomous Driving

Md Hasan Shahriar; Md Mohaimin Al Barat; Harshavardhan Sundar; Ning Zhang; Naren Ramakrishnan; Y. Thomas Hou; Wenjing Lou

arXiv:2507.09095·cs.LG·March 9, 2026

TL;DR

This paper introduces DejaVu, a novel attack exploiting temporal misalignments in multimodal perception systems of autonomous vehicles, significantly degrading perception accuracy and potentially causing safety hazards.

Contribution

The paper presents DejaVu, the first attack exploiting in-vehicular network vulnerabilities to induce temporal misalignments in multimodal perception for autonomous driving.

Findings

01

Car detection mAP drops by up to 88.5% with a single-frame LiDAR delay.

02

Object tracking accuracy decreases by 73% with camera delay.

03

Feasibility demonstrated through hardware-in-the-loop and simulation tests.

Abstract

Multimodal fusion (MMF) plays a critical role in the perception of autonomous driving, which primarily fuses camera and LiDAR streams for a comprehensive and efficient scene understanding. However, its strict reliance on precise temporal synchronization exposes it to new vulnerabilities. In this paper, we introduce DejaVu, an attack that exploits the in-vehicular network to manipulate the integrity of time and create subtle temporal misalignments, severely degrading downstream MMF-based perception tasks. Our comprehensive attack analysis across different models and datasets reveals the sensors' task-specific imbalanced sensitivities: object detection is overly dependent on LiDAR inputs, while object tracking is highly reliant on the camera inputs. Consequently, with a single-frame LiDAR delay, an attacker can reduce the car detection mAP by up to 88.5%, while with a three-frame camera…
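To make the notion of a frame-level temporal misalignment concrete, the following is a minimal sketch of what a fusion module would see when one sensor stream arrives one frame late. It models only the effect (a stale LiDAR frame paired with a fresh camera frame), not how DejaVu manipulates the in-vehicular network. The names `Frame`, `make_stream`, `delay_stream`, and `fusion_skew`, and the 100 ms frame period, are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    sensor: str
    capture_time_ms: int  # when the sensor actually captured the scene


def make_stream(sensor: str, n_frames: int, period_ms: int) -> List[Frame]:
    """Build a stream of frames captured at a fixed period (assumed value)."""
    return [Frame(sensor, i * period_ms) for i in range(n_frames)]


def delay_stream(stream: List[Frame], delay_frames: int) -> List[Frame]:
    """At fusion step i, deliver the frame captured delay_frames steps earlier.

    The earliest steps are clamped to the first frame, since nothing older exists.
    """
    return [stream[max(i - delay_frames, 0)] for i in range(len(stream))]


def fusion_skew(camera: List[Frame], lidar: List[Frame]) -> List[int]:
    """Capture-time gap (ms) between the frames paired at each fusion step."""
    return [c.capture_time_ms - l.capture_time_ms for c, l in zip(camera, lidar)]


# Hypothetical 100 ms frame period for both sensors (an assumption).
camera = make_stream("camera", n_frames=5, period_ms=100)
lidar = make_stream("lidar", n_frames=5, period_ms=100)

aligned_skew = fusion_skew(camera, lidar)
delayed_skew = fusion_skew(camera, delay_stream(lidar, delay_frames=1))

print("aligned:", aligned_skew)
print("one-frame LiDAR delay:", delayed_skew)
```

Under these assumptions, every fusion step after the first pairs a camera frame with a LiDAR frame that is one full period older, so the two modalities describe the scene at different instants.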
