FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios

Shiyi Zhang; Junhao Zhuang; Zhaoyang Zhang; Ying Shan; Yansong Tang

arXiv:2505.03730·cs.CV·May 7, 2025

TL;DR

FlexiAct transfers an action from a reference video to a target image, tolerating differences in layout, viewpoint, and skeletal structure while preserving the subject's identity. It does so with two modules, RefAdapter and FAE.

Contribution

The paper presents FlexiAct, a new method that enables adaptable action transfer across diverse subjects and scenarios, surpassing existing methods in flexibility and consistency.

Findings

01 Effective action transfer across diverse layouts and viewpoints.

02 RefAdapter outperforms existing spatial adaptation methods.

03 FAE enables direct action extraction during denoising.

Abstract

Action customization involves generating videos where the subject performs actions dictated by input control signals. Current methods use pose-guided or global motion customization but are limited by strict constraints on spatial structure, such as layout, skeleton, and viewpoint consistency, reducing adaptability across diverse subjects and scenarios. To overcome these limitations, we propose FlexiAct, which transfers actions from a reference video to an arbitrary target image. Unlike existing methods, FlexiAct allows for variations in layout, viewpoint, and skeletal structure between the subject of the reference video and the target image, while maintaining identity consistency. Achieving this requires precise action control, spatial structure adaptation, and consistency preservation. To this end, we introduce RefAdapter, a lightweight image-conditioned adapter that excels in spatial…

Code & Models

Models

🤗 shiyi0408/FlexiAct
model · ♡ 28

Datasets

shiyi0408/FlexiAct
dataset · 23 dl

Taxonomy

Topics: Human Pose and Action Recognition · Stroke Rehabilitation and Recovery · Context-Aware Activity Recognition Systems

Methods: Softmax · Attention Is All You Need · Adapter