Self-Supervised Sparse Sensor Fusion for Long Range Perception
Edoardo Palladin, Samuel Brucker, Filippo Ghilotti, Praveen Narayanan, Mario Bijelic, Felix Heide

TL;DR
This paper introduces a self-supervised sparse sensor fusion method that significantly extends perception range for autonomous vehicles to 250 meters, improving detection accuracy and forecasting in long-range highway scenarios.
Contribution
We propose a novel 3D encoding and self-supervised pre-training scheme for sensor fusion, enabling efficient long-range perception beyond existing BEV-based methods.
Findings
Achieved a 26.6% improvement in object detection mAP at ranges up to 250 m
Reduced Chamfer Distance by 30.5% in LiDAR forecasting
Extended perception range to 250 meters for highway driving
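For readers unfamiliar with the forecasting metric, the following is a minimal sketch of a symmetric Chamfer Distance between two point clouds. It is not the paper's evaluation code: definitions vary across the literature (squared versus unsquared distances, sum versus mean), and this version assumes unsquared Euclidean distances averaged in each direction. The function name `chamfer_distance` is hypothetical.

```python
import math

def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer Distance between two non-empty point clouds.

    Assumption: unsquared Euclidean distance, averaged per direction.
    Brute force, O(len(a) * len(b)); fine for illustration only.
    """
    def one_way(src, dst):
        # Mean distance from each source point to its nearest neighbour in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(cloud_a, cloud_b) + one_way(cloud_b, cloud_a)

predicted = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
observed = [(0.0, 0.0, 0.0), (10.0, 1.0, 0.0)]
print(chamfer_distance(predicted, observed))
```

A lower value means the forecast point cloud lies closer to the observed one, so a 30.5% reduction indicates forecasts that match future LiDAR sweeps more closely.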
Abstract
Outside of urban hubs, autonomous cars and trucks have to master driving on intercity highways. Safe, long-distance highway travel at speeds exceeding 100 km/h demands perception distances of at least 250 m, which is about five times the 50-100 m typically addressed in city driving, to allow sufficient planning and braking margins. Increasing the perception range also allows autonomy to extend from light two-ton passenger vehicles to large-scale forty-ton trucks, which need a longer planning horizon due to their high inertia. However, most existing perception approaches focus on shorter ranges and rely on Bird's Eye View (BEV) representations, which incur quadratic increases in memory and compute costs as distance grows. To overcome this limitation, we build on top of a sparse representation and introduce an efficient 3D encoding of multi-modal and temporal features, along with a novel…
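The quadratic cost of dense BEV grids mentioned in the abstract can be illustrated with a short back-of-the-envelope sketch. The cell size (0.5 m), channel count (128), and float32 storage are assumptions chosen for illustration and are not taken from the paper; the helper names are hypothetical.

```python
def bev_cells(max_range_m, cell_size_m=0.5):
    """Cells in a square BEV grid covering [-R, R] x [-R, R] (assumed layout)."""
    side = round(2 * max_range_m / cell_size_m)
    return side * side

def bev_memory_mib(max_range_m, cell_size_m=0.5, channels=128, bytes_per_value=4):
    """Memory of one dense feature map over that grid, in MiB (assumed sizes)."""
    return bev_cells(max_range_m, cell_size_m) * channels * bytes_per_value / 2**20

for r in (50, 100, 250):
    print(f"{r:>3} m: {bev_cells(r):>9,} cells, {bev_memory_mib(r):8.1f} MiB")

# Going from 50 m to 250 m multiplies the range by 5 and the cell count by 25.
print(bev_cells(250) / bev_cells(50))
```

Under these assumptions a fivefold increase in range costs 25 times the memory for a single dense feature map, which is the scaling that a sparse representation is meant to avoid.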
