ARM3D: Attention-based relation module for indoor 3D object detection

Yuqing Lan; Yao Duan; Chenyi Liu; Chenyang Zhu; Yueshan Xiong; Hui Huang; Kai Xu

arXiv:2202.09715 · cs.CV · March 6, 2025

TL;DR

ARM3D introduces an attention-based relation module that enhances 3D object detection by focusing on relevant relation contexts, reducing noise and ambiguity for more accurate and robust results.

Contribution

The paper proposes a novel ARM3D module that uses attention mechanisms to selectively extract useful relation context in 3D detection, improving performance over existing methods.

Findings

01. ARM3D improves detection accuracy when integrated into state-of-the-art detectors.

02. ARM3D demonstrates robustness and generalization across various 3D detection scenarios.

03. The module effectively filters out noisy relation contexts, reducing scene ambiguity.

Abstract

Relation context has proven useful for many challenging vision tasks. In 3D object detection, previous methods have used context encoding, graph embedding, or explicit relation reasoning to extract relation context. However, noisy or low-quality proposals inevitably introduce redundant relation context. In fact, invalid relation context usually indicates underlying scene misunderstanding and ambiguity, which can instead reduce performance in complex scenes. Inspired by recent attention mechanisms such as the Transformer, we propose a novel 3D attention-based relation module (ARM3D). It combines object-aware relation reasoning, which extracts pair-wise relation contexts among qualified proposals, with an attention module that distributes attention weights across the different relation contexts. In this way, ARM3D can take full…
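The idea of weighting pair-wise relation contexts by attention can be illustrated with a minimal sketch. This is not the authors' implementation: the relation feature (an element-wise difference), the scaled dot-product scoring, and the names `pair_relation` and `attend_relations` are all assumptions chosen for illustration, and the official repository should be consulted for the actual module.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pair_relation(f_i, f_j):
    # Hypothetical relation feature: element-wise difference of two
    # proposal features. The paper's relation reasoning may differ.
    return [a - b for a, b in zip(f_i, f_j)]


def attend_relations(features, i):
    """Return (attention weights, enhanced feature) for proposal i.

    Assumes at least two proposals. The proposal's own feature acts as
    the query and each pair-wise relation feature acts as key and value
    (an illustrative choice, not taken from the paper).
    """
    f_i = features[i]
    d = len(f_i)
    relations = [
        pair_relation(f_i, f_j) for j, f_j in enumerate(features) if j != i
    ]
    # Scaled dot-product score between the query and each relation.
    scores = [
        sum(q * r for q, r in zip(f_i, rel)) / math.sqrt(d)
        for rel in relations
    ]
    weights = softmax(scores)
    # Attention-weighted sum of relation features.
    context = [
        sum(w * rel[k] for w, rel in zip(weights, relations))
        for k in range(d)
    ]
    # Concatenate the original feature with its relation context.
    return weights, f_i + context


if __name__ == "__main__":
    proposals = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, enhanced = attend_relations(proposals, 0)
    print(weights, enhanced)
```

In this toy setup, relations that score low against the query receive small weights, so they contribute little to the context vector. That mirrors, in a very simplified way, the stated goal of down-weighting noisy relation contexts.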

Code & Models

Repositories

lanlan96/arm3d (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Adam · Label Smoothing · Position-Wise Feed-Forward Layer · Dense Connections · Dropout · Softmax