TrackGo: A Flexible and Efficient Method for Controllable Video Generation

Haitao Zhou; Chuang Wang; Rui Nie; Jinlin Liu; Dongdong Yu; Qian Yu; Changhu Wang

arXiv:2408.11475·cs.CV·January 7, 2025



TL;DR

TrackGo introduces a flexible, efficient method for controllable video generation using free-form masks and arrows, with a lightweight adapter that enhances control precision and achieves state-of-the-art results.

Contribution

The paper presents TrackGo, a novel controllable video generation framework utilizing free-form masks, arrows, and the TrackAdapter for improved control and efficiency.

Findings

01 Achieves state-of-the-art FVD, FID, and ObjMC scores.

02 Provides precise control over complex video scenarios.

03 Introduces a lightweight, seamless control adapter.

Abstract

Recent years have seen substantial progress in diffusion-based controllable video generation. However, achieving precise control in complex scenarios, including fine-grained object parts, sophisticated motion trajectories, and coherent background movement, remains a challenge. In this paper, we introduce TrackGo, a novel approach that leverages free-form masks and arrows for conditional video generation. This method offers users a flexible and precise mechanism for manipulating video content. We also propose the TrackAdapter for control implementation, an efficient and lightweight adapter designed to be seamlessly integrated into the temporal self-attention layers of a pretrained video generation model. This design leverages our observation that the attention map of these layers can accurately activate regions corresponding to motion in videos. Our experimental results demonstrate…
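The adapter placement described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the parallel-branch layout, and the mask-based blending are assumptions made purely for illustration. Only two ideas come from the abstract: control is attached to the temporal self-attention layers, and the attention maps of those layers are used as a cue for where motion occurs.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(x, wq, wk, wv):
    """Single-head attention over the frame axis.

    x: array of shape (frames, positions, channels). Each spatial position
    attends across frames independently of the other positions.
    Returns the output (same shape as x) and the attention map
    of shape (positions, frames, frames).
    """
    tokens = np.transpose(x, (1, 0, 2))              # (positions, frames, channels)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ np.transpose(k, (0, 2, 1)) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    out = attn @ v
    return np.transpose(out, (1, 0, 2)), attn

def adapted_temporal_attention(x, base_weights, adapter_weights, motion_mask):
    """Hypothetical adapter wiring (an assumption, not the paper's design).

    A frozen base branch and a trainable adapter branch both run temporal
    self-attention; motion_mask (shape (positions,), values in [0, 1])
    selects the adapter output inside the user-marked region.
    """
    base_out, _ = temporal_self_attention(x, *base_weights)
    adapter_out, adapter_attn = temporal_self_attention(x, *adapter_weights)
    m = motion_mask[None, :, None]                   # broadcast to (1, positions, 1)
    return m * adapter_out + (1.0 - m) * base_out, adapter_attn

# Usage with random data: 4 frames, 6 spatial positions, 8 channels.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 6, 8))
base = [rng.normal(size=(8, 8)) for _ in range(3)]
adapter = [rng.normal(size=(8, 8)) for _ in range(3)]
mask = np.array([1.0, 1.0, 0.0, 0.0, 0.0, 0.0])      # first two positions are "masked"
out, attn = adapted_temporal_attention(x, base, adapter, mask)
```

With an all-zero mask the sketch reduces to the base model's temporal attention, which is one simple way to read "seamlessly integrated": positions outside the user-marked region are left untouched.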


Taxonomy

Topics: Video Analysis and Summarization

Methods: Softmax · Attention Is All You Need · Adapter