MEGA: Masked Generative Autoencoder for Human Mesh Recovery

Guénolé Fiche; Simon Leglaive; Xavier Alameda-Pineda; Francesc Moreno-Noguer

arXiv:2405.18839 · cs.CV · March 19, 2025

TL;DR

MEGA is a masked generative autoencoder for human mesh recovery from a single image. It models the ambiguity of the task, can produce either a single mesh or multiple plausible meshes, and reports state-of-the-art results.

Contribution

The paper formulates Human Mesh Recovery (HMR) as a token generation task trained with masked autoencoding, which supports both deterministic and stochastic prediction.

Findings

01 Outperforms existing methods in deterministic mode.

02 Generates diverse meshes in stochastic mode.

03 Achieves state-of-the-art results on in-the-wild benchmarks.

Abstract

Human Mesh Recovery (HMR) from a single RGB image is a highly ambiguous problem, as an infinite set of 3D interpretations can explain the 2D observation equally well. Nevertheless, most HMR methods overlook this issue and make a single prediction without accounting for this ambiguity. A few approaches generate a distribution of human meshes, enabling the sampling of multiple predictions; however, none of them is competitive with the latest single-output model when making a single prediction. This work proposes a new approach based on masked generative modeling. By tokenizing the human pose and shape, we formulate the HMR task as generating a sequence of discrete tokens conditioned on an input image. We introduce MEGA, a MaskEd Generative Autoencoder trained to recover human meshes from images and partial human mesh token sequences. Given an image, our flexible generation scheme allows…
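
To make "generating a sequence of discrete tokens" from a partially masked sequence more concrete, here is a minimal Python sketch of confidence-based iterative unmasking, in the general style of masked generative transformers such as MaskGIT. It is an illustration under stated assumptions, not the authors' procedure. The names `MASK`, `toy_predictor`, and `iterative_decode` are hypothetical, as are the cosine schedule and the confidence ranking. The predictor returns random distributions and stands in for an image-conditioned network.

```python
import math
import random

MASK = -1  # hypothetical sentinel marking a token that is still masked


def toy_predictor(tokens, codebook_size, rng):
    """Stand-in for an image-conditioned network (assumption).

    Returns one probability distribution over codebook indices per position.
    The distributions are random here; a real model would condition on the
    image and on the tokens that are already unmasked.
    """
    probs = []
    for _ in tokens:
        weights = [rng.random() for _ in range(codebook_size)]
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs


def iterative_decode(num_tokens, codebook_size, num_steps, stochastic, seed=0):
    """Fill a fully masked token sequence over several unmasking steps."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    for step in range(1, num_steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        probs = toy_predictor(tokens, codebook_size, rng)

        # Propose one token per masked position: sample it (stochastic mode)
        # or take the most likely index (deterministic mode).
        proposals = {}
        for i in masked:
            if stochastic:
                choice = rng.choices(range(codebook_size), weights=probs[i])[0]
            else:
                choice = max(range(codebook_size), key=lambda k: probs[i][k])
            proposals[i] = (choice, probs[i][choice])

        # Cosine schedule (assumption): number of tokens left masked after
        # this step; it reaches zero at the final step.
        keep_masked = math.floor(
            num_tokens * math.cos(math.pi / 2 * step / num_steps)
        )
        num_to_fix = max(1, len(masked) - keep_masked)

        # Commit the most confident proposals and leave the rest masked.
        ranked = sorted(masked, key=lambda i: proposals[i][1], reverse=True)
        for i in ranked[:num_to_fix]:
            tokens[i] = proposals[i][0]
    return tokens


if __name__ == "__main__":
    print(iterative_decode(16, 8, num_steps=4, stochastic=False))
    print(iterative_decode(16, 8, num_steps=4, stochastic=True, seed=1))
```

In this sketch, changing the seed in stochastic mode yields different token sequences for the same input, which is how such a scheme could produce multiple mesh hypotheses once the tokens are decoded back to a mesh.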

Taxonomy

Topics: 3D Shape Modeling and Analysis

Methods: Sparse Evolutionary Training