Streaming Object Detection for 3-D Point Clouds
Wei Han, Zhengdong Zhang, Benjamin Caine, Brandon Yang, Christoph Sprunk, Ouais Alsharif, Jiquan Ngiam, Vijay Vasudevan, Jonathon Shlens, Zhifeng Chen

TL;DR
This paper introduces a streaming object detection approach for LiDAR point clouds that significantly reduces latency by operating on native streaming data, outperforming traditional methods in speed while maintaining high accuracy.
Contribution
The work presents a streaming detection system that processes LiDAR data as it arrives rather than waiting for a full scan, reducing latency and peak computational load compared to traditional batch processing.
Findings
Achieves 3-15 times lower latency than traditional systems.
Maintains competitive or superior detection accuracy.
Spreads computation over scan acquisition time for efficiency.
Abstract
Autonomous vehicles operate in a dynamic environment, where the speed with which a vehicle can perceive and react impacts the safety and efficacy of the system. LiDAR provides a prominent sensory modality that informs many existing perceptual systems including object detection, segmentation, motion estimation, and action recognition. The latency for perceptual systems based on point cloud data can be dominated by the amount of time for a complete rotational scan (e.g. 100 ms). This built-in data capture latency is artificial, and based on treating the point cloud as a camera image in order to leverage camera-inspired architectures. However, unlike camera sensors, most LiDAR point cloud data is natively a streaming data source in which laser reflections are sequentially recorded based on the precession of the laser beam. In this work, we explore how to build an object detector that…
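The latency argument in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes the 100 ms rotation period given as an example in the abstract, and it assumes the scan is divided into equal angular slices, with a hypothetical helper `acquisition_wait_ms` and arbitrary slice counts. It models only the time spent waiting for data capture, not model inference time, so it does not reproduce the paper's measured latencies.

```python
def acquisition_wait_ms(scan_period_ms: float, num_slices: int) -> float:
    """Longest time a point waits for its slice to finish being captured.

    Hypothetical helper: assumes the rotation is split into equal slices
    and ignores any detector compute time.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    return scan_period_ms / num_slices


SCAN_PERIOD_MS = 100.0  # example rotation period quoted in the abstract

# num_slices = 1 corresponds to waiting for the complete rotational scan.
for n in (1, 4, 16, 32):
    wait = acquisition_wait_ms(SCAN_PERIOD_MS, n)
    print(f"{n:>2} slices: wait up to {wait:.2f} ms before processing can start")
```

Under these assumptions, a detector that consumes smaller slices can begin work sooner, and its computation is spread across the scan period instead of arriving in a single burst at the end of the rotation.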
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Optical Sensing Technologies · Autonomous Vehicle Technology and Safety
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
