AdaptiveFusion: Adaptive Multi-Modal Multi-View Fusion for 3D Human Body Reconstruction

Anjun Chen; Xiangyu Wang; Zhi Xu; Kun Shi; Yan Qin; Yuchi Huo; Jiming Chen; Qi Ye

arXiv:2409.04851 · cs.CV · March 14, 2025

Open Access

TL;DR

AdaptiveFusion is a flexible multi-modal, multi-view fusion framework for 3D human body reconstruction that accepts arbitrary combinations of sensor inputs, tolerates noisy modalities, and is reported to outperform existing fusion methods.

Contribution

It presents an adaptive multi-modal fusion approach built on a Transformer that generalizes across sensor setups and handles noisy data with a single trained model.
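One common way Transformer-based fusion accommodates a varying set of sensors is to treat each available input as a group of tokens and let self-attention operate over whatever tokens are present. The sketch below illustrates that general idea only; it is not the AdaptiveFusion architecture. The function name `fuse_tokens`, the modality names, the toy feature values, the identity query/key/value projections, and the mean pooling are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_tokens(modality_tokens):
    """Fuse whichever modalities are present into one feature vector.

    modality_tokens maps a (hypothetical) modality name to a list of
    equal-length token vectors, or to None when that sensor is missing.
    Assumption: identity projections stand in for learned Q/K/V weights.
    """
    # Flatten all available tokens into one sequence; missing sensors are skipped.
    tokens = [
        tok
        for toks in modality_tokens.values()
        if toks is not None
        for tok in toks
    ]
    if not tokens:
        raise ValueError("at least one modality must be present")

    dim = len(tokens[0])
    scale = math.sqrt(dim)

    # Scaled dot-product self-attention: every token attends to every token.
    attended = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, tokens)) for i in range(dim)]
        )

    # Mean-pool the attended tokens into a single fused feature.
    return [sum(row[i] for row in attended) / len(attended) for i in range(dim)]


# Toy inputs: two image tokens and one radar token, no LiDAR.
image = [[0.2, 0.1, 0.0, 0.5], [0.4, 0.3, 0.1, 0.0]]
radar = [[0.0, 0.9, 0.2, 0.1]]

all_views = fuse_tokens({"image": image, "radar": radar, "lidar": None})
image_only = fuse_tokens({"image": image, "radar": None, "lidar": None})
print(len(all_views), len(image_only))
```

Because attention is computed over a variable-length token sequence, the same function runs unchanged when a sensor is added or dropped, which is the property that makes this family of designs attractive for arbitrary sensor combinations.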

Findings

01. Achieves high-quality 3D reconstruction in diverse environments.

02. Outperforms state-of-the-art fusion methods in accuracy.

03. Handles arbitrary sensor combinations and noise effectively.

Abstract

Recent advancements in sensor technology and deep learning have led to significant progress in 3D human body reconstruction. However, most existing approaches rely on data from a specific sensor, which can be unreliable due to the inherent limitations of individual sensing modalities. Additionally, existing multi-modal fusion methods generally require customized designs based on the specific sensor combinations or setups, which limits the flexibility and generality of these methods. Furthermore, conventional point-image projection-based and Transformer-based fusion networks are susceptible to the influence of noisy modalities and sensor poses. To address these limitations and achieve robust 3D human body reconstruction in various conditions, we propose AdaptiveFusion, a generic adaptive multi-modal multi-view fusion framework that can effectively incorporate arbitrary combinations of…

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Industrial Vision Systems and Defect Detection

Methods: Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer · Adam