Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments

Jiye Lee; Hanbyul Joo

arXiv:2301.02667·cs.CV·September 11, 2023

TL;DR

LAMA is a unified framework that synthesizes realistic long-term human motions involving locomotion, interaction, and manipulation in complex 3D environments using test-time optimization and reinforcement learning.

Contribution

LAMA introduces a test-time optimization approach to human motion synthesis that requires no motion data paired with 3D scenes and that covers locomotion, scene interaction, and object manipulation in one framework.

Findings

1. LAMA outperforms previous methods in realism and diversity of synthesized motions.

2. The framework effectively handles complex indoor environments and varied human behaviors.

3. Extensive experiments validate the superiority of LAMA in challenging scenarios.

Abstract

Synthesizing interaction-involved human motions has been challenging due to the high complexity of 3D environments and the diversity of possible human behaviors within. We present LAMA, Locomotion-Action-MAnipulation, to synthesize natural and plausible long-term human movements in complex indoor environments. The key motivation of LAMA is to build a unified framework to encompass a series of everyday motions including locomotion, scene interaction, and object manipulation. Unlike existing methods that require motion data "paired" with scanned 3D scenes for supervision, we formulate the problem as a test-time optimization by using human motion capture data only for synthesis. LAMA leverages a reinforcement learning framework coupled with a motion matching algorithm for optimization, and further exploits a motion editing framework via manifold learning to cover possible variations in…
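The abstract mentions a motion matching algorithm without giving details here. As a rough illustration only, motion matching in its generic form is a nearest-neighbor lookup: each motion capture frame is summarized by a feature vector, and the frame closest to a query feature is selected as the next pose. The sketch below shows that generic idea and is not LAMA's implementation; the feature layout (two root velocity components and a foot height), the function names, and the toy database are all hypothetical.

```python
import math
from typing import List, Sequence


def feature_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_frame(database: List[Sequence[float]], query: Sequence[float]) -> int:
    """Return the index of the database frame closest to the query feature."""
    if not database:
        raise ValueError("database is empty")
    return min(range(len(database)),
               key=lambda i: feature_distance(database[i], query))


# Hypothetical toy database: one feature vector per mocap frame,
# laid out as (root velocity x, root velocity z, foot height).
toy_database = [
    (0.0, 0.0, 0.0),  # standing
    (1.0, 0.0, 0.1),  # walking forward
    (0.0, 1.0, 0.1),  # side-stepping
]

# Query feature; in a learned setup this could come from a policy,
# here it is simply hard-coded for illustration.
best = match_frame(toy_database, (0.9, 0.1, 0.1))
```

In a pipeline like the one the abstract describes, the query would presumably be produced by the reinforcement learning component rather than hard-coded, and a practical system would use a much larger database with an accelerated search structure instead of a linear scan.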

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition · 3D Shape Modeling and Analysis

Methods: Softmax · Tanh Activation · Low-Rank Factorization-based Multi-Head Attention