Cross-Modal Phantom: Coordinated Camera-LiDAR Spoofing Against Multi-Sensor Fusion in Autonomous Vehicles
Shahriar Rahman Khan, Raiful Hasan

TL;DR
This paper demonstrates a coordinated spoofing attack on autonomous vehicle perception that deceives multi-sensor fusion by fabricating a consistent false object across both camera and LiDAR inputs.
Contribution
The paper introduces a simulated cross-modal spoofing method that exposes a vulnerability in multi-sensor fusion for autonomous vehicles, reporting an 85.5% success rate in deceiving perception models.
Findings
85.5% success rate in deceiving perception models
Simulated sensor-level spoofing mimics physical IR and LiDAR attacks
Reveals a critical vulnerability in multi-sensor fusion: fabricated cross-sensor agreement can bypass the redundancy that fusion is meant to provide
Abstract
Autonomous Vehicles (AVs) increasingly depend on Multi-Sensor Fusion (MSF) to combine complementary modalities such as cameras and LiDAR for robust perception. While this redundancy is intended to safeguard against single-sensor failures, the fusion process itself introduces a subtle and underexplored vulnerability. In this work, we investigate whether an attacker can bypass MSF's redundancy by fabricating cross-sensor consistency, making multiple sensors agree on the same false object. We design a coordinated, data-level (early-fusion) attack that emulates the outcome of two synchronized physical spoofing sources: an infrared (IR) projection that induces a false camera detection and a LiDAR signal injection that produces a matching 3D point cluster. Rather than implementing the physical attack hardware, we simulate its sensor-level outcomes by inserting perspective-aware image patches…
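The data-level idea described in the abstract, placing a fake object in the image and a matching point cluster at the same 3D location, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`project_to_image`, `insert_patch`, `insert_point_cluster`), the pinhole intrinsics `K`, and the object dimensions are all hypothetical, and the sketch assumes the LiDAR points are already expressed in the camera frame. It also approximates "perspective-aware" placement by scaling the patch with depth only, using a nearest-neighbour resize rather than a full perspective warp.

```python
import numpy as np


def project_to_image(point_cam, K):
    """Pinhole projection of a 3D point (camera frame, z forward) to pixel (u, v)."""
    uvw = K @ np.asarray(point_cam, dtype=float)
    return uvw[:2] / uvw[2]


def insert_patch(image, center_cam, K, patch, object_width_m):
    """Paste `patch` at the projection of `center_cam`, scaled by depth.

    Returns the modified copy and the pasted box (x0, y0, x1, y1) in pixels.
    """
    u, v = project_to_image(center_cam, K)
    depth = float(center_cam[2])

    # Apparent width in pixels of an object of the given metric width at this depth.
    width_px = max(1, int(round(K[0, 0] * object_width_m / depth)))
    height_px = max(1, int(round(width_px * patch.shape[0] / patch.shape[1])))

    # Nearest-neighbour resize via integer index maps (assumption: good enough for a sketch).
    rows = np.arange(height_px) * patch.shape[0] // height_px
    cols = np.arange(width_px) * patch.shape[1] // width_px
    resized = patch[rows][:, cols]

    # Clip the paste region to the image bounds.
    top = int(round(v)) - height_px // 2
    left = int(round(u)) - width_px // 2
    y0, x0 = max(top, 0), max(left, 0)
    y1 = min(top + height_px, image.shape[0])
    x1 = min(left + width_px, image.shape[1])

    out = image.copy()
    if y1 > y0 and x1 > x0:
        out[y0:y1, x0:x1] = resized[y0 - top:y1 - top, x0 - left:x1 - left]
    return out, (x0, y0, x1, y1)


def insert_point_cluster(points, center_cam, size_m, n_points, rng):
    """Append `n_points` sampled uniformly inside a box of `size_m` around `center_cam`."""
    offsets = rng.uniform(-0.5, 0.5, size=(n_points, 3)) * np.asarray(size_m, dtype=float)
    cluster = np.asarray(center_cam, dtype=float) + offsets
    return np.vstack([points, cluster])


# Hypothetical scene: 640x480 camera, fake object 10 m ahead and 0.5 m to the right.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
rng = np.random.default_rng(0)
image = np.zeros((480, 640, 3), dtype=np.uint8)
patch = np.full((20, 40, 3), 255, dtype=np.uint8)
points = rng.uniform(-20.0, 20.0, size=(1000, 3))
center = np.array([0.5, 0.0, 10.0])

spoofed_image, box = insert_patch(image, center, K, patch, object_width_m=1.8)
spoofed_points = insert_point_cluster(points, center, (1.8, 1.5, 4.0), 50, rng)
```

Because both insertions are driven by the same 3D center, the injected cluster projects into the pasted image region, which is the kind of cross-sensor agreement the attack relies on.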
