MEGA: Masked Generative Autoencoder for Human Mesh Recovery
Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Francesc Moreno-Noguer

TL;DR
MEGA introduces a masked generative autoencoder approach for human mesh recovery from a single image, effectively capturing ambiguity and generating accurate single or multiple mesh predictions, achieving state-of-the-art results.
Contribution
It formulates HMR as a token generation task using masked autoencoding, enabling both deterministic and stochastic predictions with improved accuracy.
Findings
Outperforms existing methods in deterministic mode.
Generates diverse meshes in stochastic mode.
Achieves state-of-the-art results on in-the-wild benchmarks.
Abstract
Human Mesh Recovery (HMR) from a single RGB image is a highly ambiguous problem, as an infinite set of 3D interpretations can explain the 2D observation equally well. Nevertheless, most HMR methods overlook this issue and make a single prediction without accounting for this ambiguity. A few approaches generate a distribution of human meshes, enabling the sampling of multiple predictions; however, none of them is competitive with the latest single-output model when making a single prediction. This work proposes a new approach based on masked generative modeling. By tokenizing the human pose and shape, we formulate the HMR task as generating a sequence of discrete tokens conditioned on an input image. We introduce MEGA, a MaskEd Generative Autoencoder trained to recover human meshes from images and partial human mesh token sequences. Given an image, our flexible generation scheme allows…
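The abstract frames HMR as generating a sequence of discrete tokens from a partially masked sequence, conditioned on an image. The toy sketch below illustrates that general idea only; it is not the authors' implementation. Everything named here is hypothetical: `toy_logits` stands in for the trained network, `MASK`, `VOCAB_SIZE`, and `SEQ_LEN` are made-up constants, and the confidence-based unmasking schedule is an assumption borrowed from generic masked generative decoding, since the abstract text above is cut off before it describes the actual generation scheme.

```python
import math
import random

MASK = -1        # hypothetical id marking a masked position
VOCAB_SIZE = 8   # hypothetical codebook size
SEQ_LEN = 6      # hypothetical number of mesh tokens


def toy_logits(tokens, image_feature):
    """Stand-in for the network: one logit vector per token position.

    A real model would condition on image features and on the tokens
    already revealed; this toy ignores the revealed tokens.
    """
    return [
        [math.cos(image_feature + pos * (v + 1)) for v in range(VOCAB_SIZE)]
        for pos in range(len(tokens))
    ]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(image_feature, steps=1, stochastic=False, rng=None):
    """Fill a fully masked token sequence in `steps` passes.

    stochastic=False takes the argmax token at each masked position;
    stochastic=True samples from the predicted distribution instead.
    """
    rng = rng or random.Random(0)
    tokens = [MASK] * SEQ_LEN
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        probs = [softmax(row) for row in toy_logits(tokens, image_feature)]
        # Propose one token (and its probability) per masked position.
        proposals = {}
        for i in masked:
            if stochastic:
                tok = rng.choices(range(VOCAB_SIZE), weights=probs[i])[0]
            else:
                tok = max(range(VOCAB_SIZE), key=lambda v: probs[i][v])
            proposals[i] = (tok, probs[i][tok])
        # Assumed schedule: reveal an even share of the remaining masked
        # positions each pass, most confident proposals first.
        n_reveal = math.ceil(len(masked) / (steps - step))
        ranked = sorted(masked, key=lambda i: -proposals[i][1])
        for i in ranked[:n_reveal]:
            tokens[i] = proposals[i][0]
    return tokens


single = generate(0.3)                                   # one pass, argmax
sample = generate(0.3, steps=3, stochastic=True,
                  rng=random.Random(1))                  # iterative sampling
```

With `steps=1` every position is filled in a single pass, which gives one repeatable prediction; with `stochastic=True` and different seeds, repeated calls can return different token sequences for the same input, which is the kind of multi-hypothesis output the abstract refers to.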
Taxonomy
Topics: 3D Shape Modeling and Analysis
Methods: Sparse Evolutionary Training
