Learning 3D Semantics from Pose-Noisy 2D Images with Hierarchical Full Attention Network

Yuhang He; Lin Chen; Junkun Xie; Long Chen

arXiv:2204.08084 · cs.CV · April 28, 2022

TL;DR

This paper introduces HiFANet, a hierarchical attention network that effectively learns 3D semantics from pose-noisy 2D images by aggregating multi-view semantic cues, improving accuracy and robustness in 3D semantic segmentation.

Contribution

The paper presents a novel hierarchical full attention network that leverages multi-view 2D images with pose noise for 3D semantic segmentation, reducing data requirements and enhancing noise tolerance.

Findings

01 Outperforms existing 3D point cloud methods significantly.

02 Requires less training data.

03 Demonstrates robustness to pose noise.

Abstract

We propose a novel framework to learn 3D point cloud semantics from 2D multi-view image observations containing pose error. On the one hand, directly learning from massive, unstructured and unordered 3D point clouds is computationally and algorithmically more difficult than learning from compactly organized and context-rich 2D RGB images. On the other hand, both LiDAR point clouds and RGB images are captured in standard automated-driving datasets. This motivates us to adopt a "task transfer" paradigm so that 3D semantic segmentation benefits from aggregating 2D semantic cues, even though the 2D image observations contain pose noise. Among all difficulties, pose noise and erroneous predictions from 2D semantic segmentation approaches are the main challenges for the task transfer. To alleviate the influence of those factors, we perceive each 3D point using multi-view images and for…
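The idea of perceiving each 3D point through several 2D views can be illustrated with a minimal sketch. This is not HiFANet's attention mechanism: plain score averaging is swapped in as the aggregation step, and the function names, the `views` dictionary layout, the pinhole camera model, and the world-to-camera transforms are all assumptions made for illustration only. Pose noise would perturb the transform `T` and shift each projected pixel, which is the failure mode the paper sets out to tolerate.

```python
import numpy as np

def project_points(points_world, K, T_cam_from_world):
    """Project N x 3 world points into a pinhole camera (hypothetical helper).

    Returns N x 2 pixel coordinates (u, v) and a mask of points in front of the camera.
    """
    n = points_world.shape[0]
    homo = np.hstack([points_world, np.ones((n, 1))])   # N x 4 homogeneous points
    cam = (T_cam_from_world @ homo.T).T[:, :3]          # points in the camera frame
    in_front = cam[:, 2] > 1e-6
    z = np.where(in_front, cam[:, 2], 1.0)              # avoid dividing by non-positive depth
    uv = (K @ cam.T).T[:, :2] / z[:, None]
    return uv, in_front

def aggregate_multiview_scores(points_world, views, num_classes):
    """Average per-pixel class scores over every view that sees each point.

    views: list of dicts with keys 'K' (3 x 3 intrinsics), 'T' (4 x 4 world-to-camera
    transform) and 'scores' (H x W x C output of any 2D segmentation model).
    Returns per-point labels (-1 if no view sees the point) and per-point view counts.
    """
    n = points_world.shape[0]
    summed = np.zeros((n, num_classes))
    counts = np.zeros(n, dtype=int)
    for view in views:
        uv, in_front = project_points(points_world, view["K"], view["T"])
        h, w, _ = view["scores"].shape
        cols = np.round(uv[:, 0]).astype(int)
        rows = np.round(uv[:, 1]).astype(int)
        visible = in_front & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
        summed[visible] += view["scores"][rows[visible], cols[visible]]
        counts[visible] += 1
    labels = np.full(n, -1)
    seen = counts > 0
    labels[seen] = np.argmax(summed[seen], axis=1)
    return labels, counts
```

Averaging treats every view equally, so a single badly posed view can pull a point toward the wrong class; a learned, attention-style weighting of views is one way to reduce that sensitivity.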

Code & Models

Repositories

yuhanghe01/hifanet
PyTorch · Official

Taxonomy

Topics: 3D Shape Modeling and Analysis · Advanced Vision and Imaging · Human Pose and Action Recognition