Multi-view Human Body Mesh Translator

Xiangjian Jiang; Xuecheng Nie; Zitian Wang; Luoqi Liu; Si Liu

arXiv:2210.01886 · cs.CV · October 6, 2022 · 1 citation

Open Access

TL;DR

This paper introduces MMT, a multi-view human body mesh translator using vision transformers, which significantly improves mesh recovery accuracy by fusing multi-view features and enforcing geometric consistency.

Contribution

The novel MMT model effectively leverages multi-view images and cross-view alignment to enhance human mesh reconstruction, outperforming existing methods.

Findings

01

28.8% improvement in MPVE (mean per-vertex error) over the state of the art on the HUMBI dataset

02

Outperforms existing models by a large margin

03

Produces high-quality human mesh reconstructions
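
For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how MPVE is commonly computed: the average Euclidean distance between corresponding predicted and ground-truth mesh vertices. The function names `mpve` and `relative_improvement` are hypothetical, the toy vertices and error values are made up for illustration and are not taken from the paper, and reading the 28.8% figure as a relative error reduction is an assumption rather than something this page confirms.

```python
import math


def mpve(pred_vertices, gt_vertices):
    """Mean per-vertex error: average Euclidean distance between
    corresponding predicted and ground-truth vertices (same units as input)."""
    if not pred_vertices or len(pred_vertices) != len(gt_vertices):
        raise ValueError("vertex lists must be non-empty and the same length")
    total = sum(math.dist(p, g) for p, g in zip(pred_vertices, gt_vertices))
    return total / len(pred_vertices)


def relative_improvement(baseline_error, new_error):
    """Percentage reduction of new_error relative to baseline_error.
    Assumption: this is one common way an 'X% improvement' is reported."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy two-vertex mesh (illustrative values only).
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
print(mpve(pred, gt))

# Hypothetical baseline and new errors, not results from the paper.
print(relative_improvement(50.0, 40.0))
```

Lower MPVE is better, so under this reading an improvement corresponds to a smaller error than the baseline.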

Abstract

Existing methods for human mesh recovery mainly focus on single-view frameworks, but they often fail to produce accurate results because the setup is ill-posed. Considering the maturity of multi-view motion capture systems, in this paper we propose to address this ill-posed problem by leveraging multiple images from different views, thus significantly enhancing the quality of recovered meshes. In particular, we present a novel Multi-view human body Mesh Translator (MMT) model for estimating human body mesh with the help of a vision transformer. Specifically, MMT takes multi-view images as input and translates them to target meshes in a single forward pass. MMT fuses features of different views in both the encoding and decoding phases, leading to representations embedded with global information. Additionally, to ensure the tokens are intensively focused on the…

Taxonomy

Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Video Surveillance and Tracking Methods