Long-Range Grouping Transformer for Multi-View 3D Reconstruction
Liying Yang, Zhenwei Zhu, Xuxin Lin, Jian Nong, Yanyan Liang

TL;DR
This paper introduces LRGT, a transformer-based network with long-range grouping attention for multi-view 3D reconstruction, effectively handling complex view information and achieving state-of-the-art accuracy on ShapeNet.
Contribution
The paper proposes a novel long-range grouping attention mechanism and a progressive upsampling decoder, improving multi-view 3D reconstruction performance.
Findings
Achieves state-of-the-art accuracy on the ShapeNet dataset.
Effectively handles complex multi-view information with long-range grouping attention (LGA).
Outperforms previous methods in 3D reconstruction quality.
Abstract
Transformer networks have demonstrated superior performance in many computer vision tasks. In a multi-view 3D reconstruction algorithm following this paradigm, self-attention must process a large number of information-dense image tokens when many input views are given. This excess of information makes the model extremely difficult to train. To alleviate the problem, recent methods either compress the number of tokens representing each view or discard the attention operations between tokens from different views, both of which hurt performance. Therefore, we propose long-range grouping attention (LGA) based on the divide-and-conquer principle. Tokens from all views are grouped for separate attention operations. The tokens in each group are sampled from all views and can provide a macro representation of the view they belong to. The…
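The grouping idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The strided assignment of tokens to groups, the single attention head, the absence of learned query/key/value projections, and the names `grouped_attention` and `num_groups` are all assumptions made for illustration. The sketch only shows that each group draws tokens from every view and that attention runs separately inside each group.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def grouped_attention(tokens, num_groups):
    """Self-attention restricted to groups of tokens (illustrative only).

    tokens: array of shape (V, N, D) = (views, tokens per view, channels).
    Assumption: group g holds token positions g, g + G, g + 2G, ... of
    EVERY view, so each group samples all views at a coarse stride.
    """
    num_views, num_tokens, dim = tokens.shape
    if num_tokens % num_groups != 0:
        raise ValueError("tokens per view must be divisible by num_groups")

    out = np.empty_like(tokens, dtype=float)
    for g in range(num_groups):
        idx = np.arange(g, num_tokens, num_groups)    # strided positions
        group = tokens[:, idx, :].reshape(-1, dim)    # (V * N / G, D)
        scores = group @ group.T / np.sqrt(dim)       # scaled dot product
        attended = softmax(scores, axis=-1) @ group   # attention within group
        out[:, idx, :] = attended.reshape(num_views, len(idx), dim)
    return out


rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 8, 4))   # 3 views, 8 tokens each, 4 channels
result = grouped_attention(tokens, num_groups=4)
print(result.shape)
```

Under these assumptions, attention over all V·N tokens at once needs (V·N)² pairwise scores, while G groups of V·N/G tokens need (V·N)²/G in total, and tokens from different views still interact inside every group.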
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Optical measurement and interference techniques
