Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments
Jiye Lee, Hanbyul Joo

TL;DR
LAMA is a unified framework that synthesizes realistic long-term human motions involving locomotion, interaction, and manipulation in complex 3D environments using test-time optimization and reinforcement learning.
Contribution
LAMA introduces a test-time optimization approach to human motion synthesis that requires no motion data paired with 3D scenes, and it covers locomotion, scene interaction, and object manipulation within a single framework.
Findings
LAMA outperforms previous methods in realism and diversity of synthesized motions.
The framework effectively handles complex indoor environments and varied human behaviors.
Extensive experiments validate the superiority of LAMA in challenging scenarios.
Abstract
Synthesizing interaction-involved human motions has been challenging due to the high complexity of 3D environments and the diversity of possible human behaviors within. We present LAMA, Locomotion-Action-MAnipulation, to synthesize natural and plausible long-term human movements in complex indoor environments. The key motivation of LAMA is to build a unified framework to encompass a series of everyday motions including locomotion, scene interaction, and object manipulation. Unlike existing methods that require motion data "paired" with scanned 3D scenes for supervision, we formulate the problem as a test-time optimization by using human motion capture data only for synthesis. LAMA leverages a reinforcement learning framework coupled with a motion matching algorithm for optimization, and further exploits a motion editing framework via manifold learning to cover possible variations in…
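The abstract mentions a motion matching algorithm without describing it. As a rough illustration of the general technique only, and not the authors' implementation, the sketch below treats motion matching as a nearest-neighbor search over per-frame feature vectors taken from motion capture data. The function name motion_match, the three-dimensional toy features, and the squared Euclidean distance are all assumptions made for this example. In a setup like the one the abstract describes, a learned policy could supply the query vector, but that coupling is assumed here rather than taken from the text.

```python
import math
from typing import List, Sequence


def motion_match(query: Sequence[float], database: List[Sequence[float]]) -> int:
    """Return the index of the database frame whose features are closest to the query.

    Hypothetical helper for illustration: plain squared Euclidean distance,
    linear scan, no feature normalization or acceleration structure.
    """
    if not database:
        raise ValueError("database must contain at least one frame")
    best_index = 0
    best_dist = math.inf
    for i, features in enumerate(database):
        # Squared Euclidean distance between the query and this frame's features.
        dist = sum((q - f) ** 2 for q, f in zip(query, features))
        if dist < best_dist:
            best_dist = dist
            best_index = i
    return best_index


# Toy feature database (assumed layout: one row per motion capture frame).
toy_database = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 1.0],
]

# A query vector, e.g. a desired future trajectory or pose descriptor (assumption).
matched = motion_match([0.9, 0.1, 0.4], toy_database)
print(matched)
```

In practice the matched frame index would select the next motion clip to play back; real systems typically normalize features and use a spatial index instead of a linear scan.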
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · 3D Shape Modeling and Analysis
Methods: Softmax · Tanh Activation · Low-Rank Factorization-based Multi-Head Attention
