VANETs Meet Autonomous Vehicles: A Multimodal 3D Environment Learning Approach
Yassine Maalej, Sameh Sorour, Ahmed Abdel-Rahim, Mohsen Guizani

TL;DR
This paper introduces a multimodal framework that fuses stereo camera images, Lidar scans, and V2V safety messages to improve object detection, recognition, and mapping for autonomous vehicles, using semi-supervised manifold alignment to map objects across modalities.
Contribution
It presents a novel multimodal fusion approach that integrates camera, Lidar, and V2V data for enhanced 3D environment understanding in autonomous vehicles.
Findings
Effective camera-Lidar and camera-V2V object mapping achieved
Improved object recognition accuracy through multimodal fusion
Demonstrated on the KITTI benchmark suite
Abstract
In this paper, we design a multimodal framework for object detection, recognition, and mapping based on the fusion of stereo camera frames, Velodyne Lidar point cloud scans, and Vehicle-to-Vehicle (V2V) Basic Safety Messages (BSMs) exchanged using Dedicated Short Range Communication (DSRC). We merge the key features of each modality: rich texture descriptions of objects from 2D images, depth and inter-object distance from 3D point clouds, and awareness of hidden vehicles from the 3D information in BSMs. We present joint pixel-to-point-cloud and pixel-to-V2V correspondences of objects in frames from the KITTI Vision Benchmark Suite, using a semi-supervised manifold alignment approach to achieve camera-Lidar and camera-V2V mapping of recognized objects that share the same underlying manifold.
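The abstract names semi-supervised manifold alignment but gives no formulation, so the following is only a minimal sketch of one common Laplacian-based variant, not the authors' implementation. It assumes each modality yields one feature vector per detected object, builds a heat-kernel graph per modality, couples the two graphs through a few known object correspondences with a penalty weight `mu`, and embeds both sets in a shared low-dimensional space. The function names, the bandwidth heuristic, and the nearest-neighbor matching step are all illustrative assumptions.

```python
import numpy as np

def heat_kernel_laplacian(points):
    """Graph Laplacian L = D - W with dense Gaussian (heat-kernel) weights."""
    sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    # Assumed heuristic: use the median squared distance as the kernel scale.
    bandwidth = np.median(sq_dist[sq_dist > 0])
    weights = np.exp(-sq_dist / bandwidth)
    np.fill_diagonal(weights, 0.0)
    return np.diag(weights.sum(axis=1)) - weights

def joint_laplacian(cam, other, pairs, mu):
    """Block-diagonal Laplacian of both modalities plus correspondence terms.

    pairs is a list of (i, j) meaning cam[i] and other[j] are the same object.
    Each pair adds the penalty mu * (f_i - g_j)^2 to the embedding cost.
    """
    n1, n2 = len(cam), len(other)
    L = np.zeros((n1 + n2, n1 + n2))
    L[:n1, :n1] = heat_kernel_laplacian(cam)
    L[n1:, n1:] = heat_kernel_laplacian(other)
    for i, j in pairs:
        a, b = i, n1 + j
        L[a, a] += mu
        L[b, b] += mu
        L[a, b] -= mu
        L[b, a] -= mu
    return L

def align(cam, other, pairs, dim=2, mu=100.0):
    """Embed both modalities in a shared dim-dimensional space."""
    L = joint_laplacian(cam, other, pairs, mu)
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Skip the constant eigenvector (eigenvalue 0); the dense graphs plus at
    # least one pair make the joint graph connected, so there is only one.
    emb = vecs[:, 1:dim + 1]
    return emb[:len(cam)], emb[len(cam):]

def match(emb_cam, emb_other):
    """For each camera object, index of the nearest object in the other modality."""
    dists = np.linalg.norm(emb_cam[:, None, :] - emb_other[None, :, :], axis=-1)
    return dists.argmin(axis=1)
```

In use, `cam` might hold per-object image descriptors and `other` per-object Lidar or BSM-derived features, with `pairs` listing the few objects whose cross-modal identity is already known; `match(*align(cam, other, pairs))` then proposes correspondences for the rest. A larger `mu` pulls known pairs closer together in the shared space, and the quality of the remaining matches depends on how well the two feature sets actually share a manifold.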
Taxonomy
Topics: Advanced Neural Network Applications · Autonomous Vehicle Technology and Safety · Video Surveillance and Tracking Methods
