MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
Qian Xie, Yu-Kun Lai, Jing Wu, Zhoutao Wang, Yiming Zhang, Kai Xu, Jun Wang

TL;DR
MLCVNet extends VoteNet for 3D object detection by adding multi-level contextual information through self-attention and multi-scale feature fusion, improving detection accuracy on the SUN RGB-D and ScanNet datasets.
Contribution
We introduce MLCVNet, a 3D detection framework that adds patch-level, object-level, and scene-level context modules to VoteNet so that objects are recognized in relation to their surroundings rather than in isolation.
Findings
Achieves state-of-the-art results on the SUN RGB-D dataset.
Outperforms existing methods on the ScanNet dataset.
Effectively captures multi-level contextual information.
Abstract
In this paper, we address the 3D object detection task by capturing multi-level contextual information with the self-attention mechanism and multi-scale feature fusion. Most existing 3D object detection methods recognize objects individually, without considering the contextual information between these objects. In contrast, we propose Multi-Level Context VoteNet (MLCVNet) to recognize 3D objects correlatively, building on the state-of-the-art VoteNet. We introduce three context modules into the voting and classifying stages of VoteNet to encode contextual information at different levels. Specifically, a Patch-to-Patch Context (PPC) module is employed to capture contextual information between the point patches before they vote for their corresponding object centroid points. Subsequently, an Object-to-Object Context (OOC) module is incorporated before the proposal and…
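The abstract names self-attention as the mechanism for relating point patches to one another. The excerpt does not give the exact formulation used in the PPC module, so the following is only a minimal sketch of generic scaled dot-product self-attention with a residual connection, not the authors' implementation. The function name `self_attention`, the `(N, C)` feature layout, the sizes, and the random projection matrices are all assumptions made for illustration.

```python
import numpy as np


def self_attention(features, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention with a residual connection.

    features: (N, C) array, one row per point patch (assumed layout).
    w_q, w_k, w_v: (C, C) projection matrices (random here; learned in a real network).
    Returns the context-enhanced features and the (N, N) attention weights.
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v

    # Pairwise similarity between every patch and every other patch.
    scores = q @ k.T / np.sqrt(k.shape[1])

    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each patch receives a weighted sum of all patches' values,
    # added back onto its own feature vector.
    context = weights @ v
    return features + context, weights


# Hypothetical sizes: 8 patches with 16-dimensional features.
rng = np.random.default_rng(0)
num_patches, channels = 8, 16
patches = rng.standard_normal((num_patches, channels))
w_q, w_k, w_v = (rng.standard_normal((channels, channels)) * 0.1 for _ in range(3))

enhanced, attn = self_attention(patches, w_q, w_k, w_v)
```

In this sketch each row of `attn` sums to 1, so every patch feature is updated with a convex combination of all patches' projected features. That is one plausible way to read "capturing contextual information between the point patches" before the voting step.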
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Multimodal Machine Learning Applications
