CRAFT: Aligning Diffusion Models with Fine-Tuning Is Easier Than You Think
Zening Sun, Zhengpeng Xie, Lichen Bai, Shitong Shao, Shuo Yang, and Zeke Xie

TL;DR
CRAFT is a novel fine-tuning approach for diffusion models that reduces data requirements and improves efficiency by using composite reward filtering and an enhanced supervised fine-tuning method, outperforming existing techniques.
Contribution
The paper introduces CRAFT, a lightweight fine-tuning paradigm that constructs high-quality datasets with less data and computational cost, and provides a theoretical link to reinforcement learning.
Findings
CRAFT with only 100 samples outperforms methods using thousands of preference pairs.
CRAFT achieves 11-220x faster convergence than baseline preference optimization methods.
Empirical results demonstrate CRAFT's superior performance and efficiency.
Abstract
Diffusion model alignment has achieved remarkable breakthroughs in generating high-quality, human preference-aligned images. Existing techniques, such as supervised fine-tuning (SFT) and DPO-style preference optimization, have become principled tools for fine-tuning diffusion models. However, SFT relies on high-quality images that are costly to obtain, while DPO-style methods depend on large-scale preference datasets, which are often inconsistent in quality. Beyond data dependency, these methods are further constrained by computational inefficiency. To address these two challenges, we propose Composite Reward Assisted Fine-Tuning (CRAFT), a lightweight yet powerful fine-tuning paradigm that requires significantly reduced training data while maintaining computational efficiency. It first leverages a Composite Reward Filtering (CRF) technique to construct a high-quality and consistent…
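The abstract is cut off before it describes how Composite Reward Filtering selects its data, so the following is only a rough sketch of the general idea of filtering generated samples with several reward signals at once. Everything in it is an assumption made for illustration: the function name `composite_reward_filter`, the `reward_fns` mapping, and especially the rank-sum aggregation rule, which is a generic way to combine rewards on different scales and is not claimed to be the rule used in the paper.

```python
from typing import Any, Callable, Dict, List, Sequence


def composite_reward_filter(
    samples: Sequence[Any],
    reward_fns: Dict[str, Callable[[Any], float]],
    keep: int = 100,
) -> List[Any]:
    """Keep the `keep` samples with the best summed rank over all rewards.

    ASSUMPTION: rank-sum aggregation is an illustrative choice, not the
    paper's documented filtering rule. Ranks are used instead of raw
    scores so that rewards on different scales contribute equally.
    """
    n = len(samples)
    rank_sum = [0] * n
    for fn in reward_fns.values():
        scores = [fn(s) for s in samples]
        # Rank 0 is the highest-scoring sample under this reward.
        order = sorted(range(n), key=lambda i: scores[i], reverse=True)
        for rank, idx in enumerate(order):
            rank_sum[idx] += rank
    # Lower summed rank means consistently good across every reward.
    best = sorted(range(n), key=lambda i: rank_sum[i])[:keep]
    return [samples[i] for i in best]
```

In practice each entry of `reward_fns` would wrap a learned reward model that scores a generated image, and the surviving samples would form the small fine-tuning set. The default `keep=100` only mirrors the 100-sample figure quoted in the findings.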
Taxonomy
Topics: Image and Video Quality Assessment · Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis
