ControlNeXt: Powerful and Efficient Control for Image and Video Generation

Bohao Peng; Jian Wang; Yuechen Zhang; Wenbo Li; Ming-Chang Yang; Jiaya Jia

arXiv:2408.06070 · cs.CV · March 11, 2025

TL;DR

ControlNeXt is a lightweight method for controllable image and video generation. It replaces heavy control branches with a design that adds minimal cost over the base model, and it integrates with existing LoRA weights so that style can be altered without additional training.

Contribution

The paper presents a simplified architecture with minimal additional cost, reduces learnable parameters by up to 90%, and introduces Cross Normalization for faster and more stable training convergence.
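
This page does not define Cross Normalization, so the following is only a minimal sketch under a stated assumption: that the technique rescales control-branch features so their mean and standard deviation match those of the main-branch features before the two are added. The helper name `cross_normalize`, the `eps` value, and the toy feature lists are hypothetical and are not taken from the authors' code.

```python
import math


def cross_normalize(control, main, eps=1e-12):
    """Hypothetical helper: align the statistics of `control` to those of `main`.

    Assumption (not confirmed by this page): control features are shifted and
    scaled so that their mean and standard deviation match the main branch.
    """
    def mean_std(values):
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        return mean, math.sqrt(variance)

    mu_c, sd_c = mean_std(control)
    mu_m, sd_m = mean_std(main)
    scale = sd_m / (sd_c + eps)  # eps guards against a zero-variance control signal
    return [(v - mu_c) * scale + mu_m for v in control]


# Toy features: the control branch lives on a much larger scale than the main branch.
main = [0.1, -0.2, 0.05, 0.3]
control = [10.0, 30.0, 20.0, 40.0]

aligned = cross_normalize(control, main)
fused = [m + c for m, c in zip(main, aligned)]  # element-wise addition after alignment
```

Under this assumption, the point of the alignment is that a freshly initialized control branch cannot swamp the pretrained features with values on a very different scale, which is one plausible reason such a step would help training converge quickly and stably.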

Findings

01. Demonstrates robustness across various models and data types.

02. Achieves up to 90% reduction in learnable parameters.

03. Enables style alteration without additional training.

Abstract

Diffusion models have demonstrated remarkable and robust abilities in both image and video generation. To achieve greater control over generated results, researchers introduce additional architectures, such as ControlNet, Adapters and ReferenceNet, to integrate conditioning controls. However, current controllable generation methods often require substantial additional computational resources, especially for video generation, and face challenges in training or exhibit weak control. In this paper, we propose ControlNeXt: a powerful and efficient method for controllable image and video generation. We first design a more straightforward and efficient architecture, replacing heavy additional branches with minimal additional cost compared to the base model. Such a concise structure also allows our method to seamlessly integrate with other LoRA weights, enabling style alteration without the…

Code & Models

Repositories

dvlab-research/controlnext
PyTorch · Official

Taxonomy

Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Medical Image Segmentation Techniques

Methods: Balanced Selection