ARM3D: Attention-based relation module for indoor 3D object detection
Yuqing Lan, Yao Duan, Chenyi Liu, Chenyang Zhu, Yueshan Xiong, Hui Huang, Kai Xu

TL;DR
ARM3D introduces an attention-based relation module that enhances 3D object detection by focusing on relevant relation contexts, reducing noise and ambiguity for more accurate and robust results.
Contribution
The paper proposes a novel ARM3D module that uses attention mechanisms to selectively extract useful relation context in 3D detection, improving performance over existing methods.
Findings
ARM3D improves detection accuracy when integrated into state-of-the-art detectors.
ARM3D demonstrates robustness and generalization across various 3D detection scenarios.
The module effectively filters out noisy relation contexts, reducing scene ambiguity.
Abstract
Relation context has proven useful for many challenging vision tasks. In 3D object detection, previous methods have used context encoding, graph embedding, or explicit relation reasoning to extract relation context. However, noisy or low-quality proposals inevitably produce redundant relation context. Invalid relation context usually indicates underlying scene misunderstanding and ambiguity, which can actually reduce performance in complex scenes. Inspired by recent attention mechanisms such as the Transformer, we propose a novel 3D attention-based relation module (ARM3D). It comprises object-aware relation reasoning, which extracts pair-wise relation contexts among qualified proposals, and an attention module, which distributes attention weights across the different relation contexts. In this way, ARM3D can take full…
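The two steps named in the abstract, pair-wise relation contexts among qualified proposals followed by attention weights over those contexts, can be illustrated with a minimal sketch. This is not the authors' implementation. Selecting "qualified" proposals as the top-k by objectness score, using an element-wise feature difference as the relation feature, and scoring relations with a scaled dot product are all assumptions made here for illustration, and the names `relation_context`, `objectness`, and `keep` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def relation_context(features, objectness, keep=3):
    """Attention-weighted relation context for each qualified proposal.

    features:   list of per-proposal feature vectors (equal length)
    objectness: one confidence score per proposal
    keep:       how many top-scoring proposals count as "qualified" (assumption)
    Returns {proposal_index: (attention_weights, context_vector)}.
    """
    # Step 1 (assumption): keep only the top-k proposals by objectness.
    order = sorted(range(len(features)), key=lambda i: objectness[i], reverse=True)
    qualified = order[:keep]
    dim = len(features[0])
    scale = math.sqrt(dim)

    result = {}
    for i in qualified:
        partners = [j for j in qualified if j != i]
        if not partners:
            continue  # a single proposal has no pair-wise relations
        # Step 2 (placeholder): relation feature = element-wise difference.
        relations = [
            [fj - fi for fi, fj in zip(features[i], features[j])]
            for j in partners
        ]
        # Step 3: scaled dot-product score of each relation against proposal i,
        # normalised with softmax so the weights sum to 1.
        weights = softmax([dot(features[i], r) / scale for r in relations])
        # Step 4: the context is the attention-weighted sum of relation features.
        context = [
            sum(w * r[d] for w, r in zip(weights, relations)) for d in range(dim)
        ]
        result[i] = (weights, context)
    return result
```

Calling `relation_context(features, objectness, keep=3)` on four proposals returns entries only for the three highest-scoring ones, so the lowest-scoring proposal contributes no relation context. A learned module would replace the fixed difference and dot-product scoring with trainable layers.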
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Adam · Label Smoothing · Position-Wise Feed-Forward Layer · Dense Connections · Dropout · Softmax
