Dynamic Concepts Personalization from Single Videos

Rameen Abdal; Or Patashnik; Ivan Skorokhodov; Willi Menapace; Aliaksandr Siarohin; Sergey Tulyakov; Daniel Cohen-Or; Kfir Aberman

arXiv:2502.14844·cs.GR·February 21, 2025

TL;DR

This paper introduces Set-and-Sequence, a novel framework for personalizing text-to-video models to capture dynamic concepts by learning appearance and motion through a two-stage fine-tuning process, enabling better editability and compositionality.

Contribution

It proposes a new spatio-temporal weight space for DiT-based models, allowing dynamic concept personalization from single videos with a two-stage LoRA fine-tuning approach.

Findings

01 Effective embedding of dynamic concepts into video models.

02 Enhanced editability and compositionality of personalized videos.

03 Sets a new benchmark for dynamic concept personalization.

Abstract

Personalizing generative text-to-image models has seen remarkable progress, but extending this personalization to text-to-video models presents unique challenges. Unlike static concepts, personalizing text-to-video models has the potential to capture dynamic concepts, i.e., entities defined not only by their appearance but also by their motion. In this paper, we introduce Set-and-Sequence, a novel framework for personalizing Diffusion Transformers (DiTs)-based generative video models with dynamic concepts. Our approach imposes a spatio-temporal weight space within an architecture that does not explicitly separate spatial and temporal features. This is achieved in two key stages. First, we fine-tune Low-Rank Adaptation (LoRA) layers using an unordered set of frames from the video to learn an identity LoRA basis that represents the appearance, free from temporal interference. In the…
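The first stage described in the abstract can be illustrated with a minimal sketch, assuming the standard LoRA formulation in which an adapted weight is W + scale * (B A) with low-rank factors B and A. The helper names (`matmul`, `lora_weight`, `frames_as_unordered_set`) are hypothetical and do not come from the paper. The sketch only shows the low-rank update and how discarding frame order turns a video into an unordered set; it trains nothing and does not cover the second stage, which the truncated abstract above does not describe.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, b_mat, a_mat, scale=1.0):
    """Return W + scale * (B @ A), the usual LoRA-adapted weight.

    w:     d_out x d_in frozen base weight
    b_mat: d_out x r    low-rank factor (commonly zero-initialized)
    a_mat: r x d_in     low-rank factor
    """
    delta = matmul(b_mat, a_mat)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def frames_as_unordered_set(frames, seed=0):
    """Return a shuffled copy of the frames so temporal order is discarded.

    Assumption: this stands in for "an unordered set of frames"; the
    paper's actual sampling procedure is not specified in this listing.
    """
    rng = random.Random(seed)
    shuffled = list(frames)
    rng.shuffle(shuffled)
    return shuffled


# Toy example: a 2x2 base weight with a rank-1 update.
base = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[3.0, 4.0]]
adapted = lora_weight(base, B, A)

# Frame indices stand in for the frames of a single video.
video = list(range(8))
frame_set = frames_as_unordered_set(video)
```

With B set to all zeros, `lora_weight` returns the base weight unchanged, which is why LoRA fine-tuning commonly starts from the pretrained model's behavior.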

Taxonomy

Topics: Video Analysis and Summarization

Methods: Diffusion · Sparse Evolutionary Training