Learning Human Skill Generators at Key-Step Levels
Yilu Wu, Chenhui Zhu, Shuai Wang, Hanlin Wang, Jing Wang, Zhaoxiang Zhang, Limin Wang

TL;DR
This paper introduces a new task called Key-step Skill Generation (KS-Gen) to generate key steps of human skills in videos, along with a dataset and a multi-stage framework to improve the synthesis of complex, multi-step human actions.
Contribution
The paper proposes the KS-Gen task and a multi-stage framework for it that combines multimodal language models, key-step image generation, and video synthesis to generate human skill videos more effectively.
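The data flow through those three stages can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `KeyStepClip`, `generate_key_step_clips`, `plan`, `make_image`, and `make_clip` are hypothetical, strings stand in for images and frames, and conditioning each key-step image on the previous one is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class KeyStepClip:
    description: str   # text description of one key step
    key_image: str     # stand-in for the generated key-step image
    frames: List[str]  # stand-in for the synthesized clip frames


# Hypothetical stage interfaces (assumptions, not the paper's API).
Planner = Callable[[str, str], List[str]]          # (initial_state, skill) -> step descriptions
ImageGenerator = Callable[[str, str], str]         # (previous_image, description) -> key image
VideoSynthesizer = Callable[[str, str], List[str]]  # (key_image, description) -> frames


def generate_key_step_clips(
    initial_state: str,
    skill: str,
    plan: Planner,
    make_image: ImageGenerator,
    make_clip: VideoSynthesizer,
) -> List[KeyStepClip]:
    """Chain the three stages: plan key steps, draw a key image per step, then synthesize a clip."""
    clips: List[KeyStepClip] = []
    previous_image = initial_state
    for description in plan(initial_state, skill):
        # Assumption: each key image is conditioned on the previous one.
        key_image = make_image(previous_image, description)
        frames = make_clip(key_image, description)
        clips.append(KeyStepClip(description, key_image, frames))
        previous_image = key_image
    return clips


# Toy stand-ins so the sketch runs without any model.
def toy_plan(initial_state: str, skill: str) -> List[str]:
    return [f"{skill}: step {i}" for i in (1, 2, 3)]


def toy_image(previous_image: str, description: str) -> str:
    return f"image<{description}>"


def toy_clip(key_image: str, description: str) -> List[str]:
    return [f"{key_image}#frame{i}" for i in range(2)]


clips = generate_key_step_clips("kitchen.png", "make tea", toy_plan, toy_image, toy_clip)
for clip in clips:
    print(clip.description, len(clip.frames))
```

Passing the stages in as callables keeps the sketch independent of any particular language model or generator; in a real system each one would wrap a trained model.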
Findings
The framework improves temporal consistency in skill video generation.
A new dataset and evaluation metrics for key-step skill generation are introduced.
Results demonstrate better handling of complex, multi-step human actions.
Abstract
We are committed to learning human skill generators at key-step levels. The generation of skills is a challenging endeavor, but its successful implementation could greatly facilitate human skill learning and provide more experience for embodied intelligence. Although current video generation models can synthesize simple, atomic human operations, they struggle with human skills because of their complex procedures. Human skills involve multi-step, long-duration actions and complex scene transitions, so existing naive auto-regressive methods for synthesizing long videos cannot generate them. To address this, we propose a novel task, Key-step Skill Generation (KS-Gen), aimed at reducing the complexity of generating human skill videos. Given the initial state and a skill description, the task is to generate video clips of key steps to complete the skill, rather than a…
Taxonomy
Topics: Human Resource Development and Performance Evaluation · AI and HR Technologies · Cognitive Science and Mapping
