May the Dance be with You: Dance Generation Framework for Non-Humanoids
Hyemin Ahn

TL;DR
This paper introduces a framework that lets non-humanoid agents learn to dance from human videos: a reward model, trained with contrastive learning to relate optical flow to music, guides reinforcement learning of the dancer.
Contribution
It presents the first framework for non-humanoid dance generation from human videos, utilizing a reward model based on contrastive learning of optical flow and music features.
Findings
Generated dances align well with music beats
The framework outperforms baseline methods in human preference
Optical flow effectively captures visual rhythm for dance synthesis
Abstract
We hypothesize that dance is motion that forms a visual rhythm from music, where the visual rhythm can be perceived from optical flow. If an agent can recognize the relationship between visual rhythm and music, it will be able to dance by generating motion that creates a visual rhythm matching the music. Based on this, we propose a framework for any kind of non-humanoid agent to learn how to dance from human videos. Our framework works in two stages: (1) training a reward model that perceives the relationship between optical flow (visual rhythm) and music from human dance videos, and (2) training the non-humanoid dancer with reinforcement learning based on that reward model. Our reward model consists of two feature encoders, one for optical flow and one for music. They are trained with contrastive learning, which increases the similarity between concurrent optical flow and music features.…
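The contrastive objective described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact loss, so this sketch assumes a symmetric InfoNCE-style loss over a batch of paired features, and it assumes that the reward is the cosine similarity between one optical flow feature and one music feature. The function names (`info_nce_loss`, `reward`), the temperature value, and the feature shapes are hypothetical and not taken from the paper. The encoders themselves are omitted, so the inputs stand in for encoder outputs.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale vectors along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def info_nce_loss(flow_feats, music_feats, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumption, not the paper's stated loss).

    flow_feats, music_feats: arrays of shape (N, D). Row i of each array is
    assumed to come from the same moment of the same video (a concurrent pair);
    all other rows in the batch serve as negatives.
    """
    f = l2_normalize(flow_feats)
    m = l2_normalize(music_feats)
    logits = f @ m.T / temperature  # (N, N) scaled cosine similarities

    def diagonal_cross_entropy(lg):
        # Numerically stable log-softmax per row; the target of row i is column i.
        lg = lg - lg.max(axis=1, keepdims=True)
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the flow-to-music and music-to-flow directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


def reward(flow_feat, music_feat):
    """Hypothetical reward: cosine similarity between two single feature vectors."""
    return float(l2_normalize(flow_feat) @ l2_normalize(music_feat))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16))  # stand-in for encoder outputs
    print("loss, matched pairs:   ", info_nce_loss(feats, feats))
    print("loss, mismatched pairs:", info_nce_loss(feats, feats[::-1]))
    print("reward, identical features:", reward(feats[0], feats[0]))
```

Under these assumptions, minimizing the loss pulls concurrent optical flow and music features together and pushes non-concurrent ones apart, so a reinforcement learning agent rewarded by the similarity score is encouraged to produce motion whose optical flow matches the music being played.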
Taxonomy
Topics: Robotic Locomotion and Control · Human Motion and Animation · Reinforcement Learning in Robotics
Methods: ALIGN · Contrastive Learning
