SFV: Reinforcement Learning of Physical Skills from Videos
Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine

TL;DR
This paper introduces SFV, a method that enables physically simulated characters to learn diverse skills from videos using deep pose estimation and reinforcement learning, reducing reliance on motion capture data.
Contribution
The paper presents a novel approach combining deep pose estimation and reinforcement learning to learn physical skills directly from videos, expanding data sources for character animation.
Findings
The learned controllers are robust to perturbations.
The method can learn a broad range of skills.
The method can also predict human motions from still images.
Abstract
Data-driven character animation based on motion capture can produce highly naturalistic behaviors and, when combined with physics simulation, can provide for natural procedural responses to physical perturbations, environmental changes, and morphological discrepancies. Motion capture remains the most popular source of motion data, but collecting mocap data typically requires heavily instrumented environments and actors. In this paper, we propose a method that enables physically simulated characters to learn skills from videos (SFV). Our approach, based on deep pose estimation and deep reinforcement learning, allows data-driven animation to leverage the abundance of publicly available video clips from the web, such as those from YouTube. This has the potential to enable fast and easy design of character controllers simply by querying for video recordings of the desired behavior. The…
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Diversity and Impact of Dance
