Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models
Yongjin Yang, Sihyeon Kim, Hojung Jung, Sangmin Bae, SangMook Kim, Se-Young Yun, Kimin Lee

TL;DR
FiFA is an automated data filtering method that improves the efficiency and effectiveness of fine-tuning text-to-image diffusion models with human feedback, reducing data requirements and enhancing model alignment.
Contribution
The paper introduces FiFA, a novel automated filtering algorithm that optimizes data selection for human feedback-based model fine-tuning, incorporating preference margin, text quality, and diversity.
Findings
Significantly improves training stability and performance.
Models trained with FiFA are preferred by humans 17% more often while using less than 0.5% of the full data.
Reduces GPU hours to 1% of those needed for training on the full dataset.
Abstract
Fine-tuning text-to-image diffusion models with human feedback is an effective method for aligning model behavior with human intentions. However, this alignment process often suffers from slow convergence due to the large size and noise present in human feedback datasets. In this work, we propose FiFA, a novel automated data filtering algorithm designed to enhance the fine-tuning of diffusion models using human feedback datasets with direct preference optimization (DPO). Specifically, our approach selects data by solving an optimization problem to maximize three components: preference margin, text quality, and text diversity. The preference margin, calculated using a proxy reward model, is used to identify highly informative samples and thereby address the noisy nature of feedback datasets. Additionally, we incorporate text quality, assessed by large language models…
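The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the greedy procedure, the token-entropy diversity measure, the weights `alpha` and `gamma`, and the field names (`reward_win`, `reward_lose`, `text_quality`) are all assumptions made for illustration. The sketch assumes each preference pair already carries proxy reward scores for the preferred and rejected images and an LLM-assigned text quality score for its prompt.

```python
import math
from collections import Counter


def token_entropy(counts):
    """Shannon entropy (in nats) of a token-frequency table."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_pairs(samples, k, alpha=1.0, gamma=1.0):
    """Greedily pick up to k preference pairs (hypothetical scoring rule).

    Each sample is a dict with hypothetical keys:
      'prompt'        text prompt of the pair
      'reward_win'    proxy reward of the preferred image
      'reward_lose'   proxy reward of the rejected image
      'text_quality'  prompt quality score, e.g. from an LLM
    Returns the indices of the selected samples in selection order.
    """
    chosen = []
    counts = Counter()  # token counts over prompts selected so far
    remaining = list(range(len(samples)))

    while remaining and len(chosen) < k:
        base = token_entropy(counts)

        def gain(i):
            s = samples[i]
            # Preference margin: gap between the two proxy rewards.
            margin = abs(s["reward_win"] - s["reward_lose"])
            # Diversity: entropy increase from adding this prompt's tokens.
            new_counts = counts + Counter(s["prompt"].lower().split())
            diversity_gain = token_entropy(new_counts) - base
            return margin + alpha * s["text_quality"] + gamma * diversity_gain

        best = max(remaining, key=gain)
        chosen.append(best)
        counts.update(samples[best]["prompt"].lower().split())
        remaining.remove(best)

    return chosen
```

Under this scoring rule, a pair with a small reward gap and a prompt that repeats already-selected tokens ranks last, which mirrors the intent of filtering out noisy or redundant feedback. An exact solver over the same three terms may be preferable to the greedy loop, which is used here only to keep the example short.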
Taxonomy
Topics: Speech Recognition and Synthesis · Computational and Text Analysis Methods · Advanced Data Compression Techniques
Methods: Diffusion
