UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes
Ted Lentsch, Holger Caesar, Dariu M. Gavrila

TL;DR
UNION is an unsupervised 3D object detection method that combines spatial clustering and scene flow on LiDAR data with appearance-based clustering to identify both static and dynamic objects, more than doubling average precision on nuScenes to 39.5.
Contribution
UNION combines spatial clustering, self-supervised scene flow, and appearance encoding so that static and dynamic objects are obtained together and the detector is trained once, avoiding the computationally expensive rounds of self-training that earlier approaches rely on.
Findings
More than doubles average precision to 39.5 on nuScenes
Effectively distinguishes static and dynamic objects without manual labels
Achieves state-of-the-art performance in unsupervised 3D object discovery
Abstract
Unsupervised 3D object detection methods have emerged to leverage vast amounts of data without requiring manual labels for training. Recent approaches rely on dynamic objects for learning to detect mobile objects but penalize the detections of static instances during training. Multiple rounds of self-training are used to add detected static instances to the set of training targets; this procedure to improve performance is computationally expensive. To address this, we propose the method UNION. We use spatial clustering and self-supervised scene flow to obtain a set of static and dynamic object proposals from LiDAR. Subsequently, object proposals' visual appearances are encoded to distinguish static objects in the foreground and background by selecting static instances that are visually similar to dynamic objects. As a result, static and dynamic mobile objects are obtained together, and…
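The appearance-based selection described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `kmeans` and `select_mobile`, the choice of k-means, the 2-D toy embeddings, and the `min_dynamic_frac` threshold are all hypothetical. The sketch clusters proposal embeddings and keeps every proposal, static or dynamic, whose cluster contains a sufficient share of dynamic proposals.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iters=10):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centre.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def select_mobile(embeddings, is_dynamic, k=2, min_dynamic_frac=0.05):
    """Keep proposals in appearance clusters with enough dynamic members.

    embeddings: appearance vectors, one per object proposal (assumed given).
    is_dynamic: flags from scene flow, one per proposal (assumed given).
    min_dynamic_frac: illustrative threshold, not a value from the text.
    """
    labels = kmeans(embeddings, k)
    mobile_clusters = set()
    for c in range(k):
        flags = [d for d, l in zip(is_dynamic, labels) if l == c]
        if flags and sum(flags) / len(flags) >= min_dynamic_frac:
            mobile_clusters.add(c)
    keep = [l in mobile_clusters for l in labels]
    return keep, labels


# Toy data: four vehicle-like proposals (two of them moving) and four
# background-like proposals (none moving), well separated in embedding space.
embeddings = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
    [10.0, 10.0], [10.1, 10.0], [10.0, 10.1], [10.1, 10.1],
]
is_dynamic = [True, False, False, True, False, False, False, False]

keep, labels = select_mobile(embeddings, is_dynamic)
print(keep)  # [True, True, True, True, False, False, False, False]
```

In this toy input the first cluster contains two proposals flagged as dynamic, so its static members are kept as well, while the second cluster has no dynamic members and is treated as background.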
Taxonomy
Topics: Face recognition and analysis · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
