Multi-branch Collaborative Learning Network for 3D Visual Grounding
Zhipeng Qian, Yiwei Ma, Zhekai Lin, Jiayi Ji, Xiawu Zheng, Xiaoshuai Sun, Rongrong Ji

TL;DR
This paper introduces a multi-branch framework for 3D visual grounding tasks, employing dedicated branches and novel modules to improve collaboration and achieve state-of-the-art results in 3D referring expression comprehension and segmentation.
Contribution
The proposed MCLN framework with RSA and ASA modules enables effective independent learning and mutual reinforcement for 3DREC and 3DRES, surpassing previous collaborative methods.
Findings
Achieves 2.05% higher accuracy in 3DREC at Acc@0.5
Improves 3DRES mIoU by 3.96%
Demonstrates state-of-the-art performance on both tasks
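For readers unfamiliar with the two metrics named above, here is a minimal sketch of how IoU-thresholded accuracy (for 3DREC) and mean IoU (for 3DRES) are commonly computed. It is not the authors' evaluation code. It assumes axis-aligned 3D boxes, masks represented as sets of point indices, and mIoU averaged per sample; all function names are hypothetical.

```python
def box_iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes (xmin, ymin, zmin, xmax, ymax, zmax)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    return inter / (volume(a) + volume(b) - inter)


def acc_at_iou(pred_boxes, gt_boxes, threshold):
    """Fraction of predictions whose box IoU with the ground truth meets the threshold."""
    hits = sum(box_iou_3d(p, g) >= threshold for p, g in zip(pred_boxes, gt_boxes))
    return hits / len(gt_boxes)


def mask_iou(pred_points, gt_points):
    """IoU of two point masks given as sets of point indices (assumed representation)."""
    union = pred_points | gt_points
    return len(pred_points & gt_points) / len(union) if union else 0.0


def mean_iou(pred_masks, gt_masks):
    """Per-sample mask IoU averaged over all samples (assumed averaging scheme)."""
    scores = [mask_iou(p, g) for p, g in zip(pred_masks, gt_masks)]
    return sum(scores) / len(scores)
```

Under these assumptions, a predicted box counts as correct at Acc@0.5 only if it overlaps the ground-truth box with IoU of at least 0.5, while mIoU rewards partial mask overlap continuously.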
Abstract
3D referring expression comprehension (3DREC) and segmentation (3DRES) have overlapping objectives, indicating their potential for collaboration. However, existing collaborative approaches predominantly depend on the results of one task to make predictions for the other, limiting effective collaboration. We argue that employing separate branches for 3DREC and 3DRES tasks enhances the model's capacity to learn specific information for each task, enabling them to acquire complementary knowledge. Thus, we propose the MCLN framework, which includes independent branches for 3DREC and 3DRES tasks. This enables dedicated exploration of each task and effective coordination between the branches. Furthermore, to facilitate mutual reinforcement between these branches, we introduce a Relative Superpoint Aggregation (RSA) module and an Adaptive Soft Alignment (ASA) module. These modules…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Simulation and Modeling Applications
Methods: Softmax · Attention Is All You Need
