ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation
Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, Yuxiao Dong

TL;DR
This paper introduces ImageReward, a human preference reward model for text-to-image generation, and proposes Reward Feedback Learning (ReFL) to optimize models based on human feedback, improving evaluation and generation quality.
Contribution
The paper develops ImageReward, the first general-purpose human preference reward model for text-to-image tasks, and introduces ReFL, a novel tuning method leveraging human preference feedback.
Findings
ImageReward outperforms existing scoring models in human evaluation.
ReFL tunes diffusion models directly against a reward scorer, and both automatic and human evaluation favor it over the compared methods.
Code and datasets are publicly available for reproducibility.
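To make the first finding concrete: one simple way to compare a scorer against human judgments is to count how often it gives the human-preferred image the higher score. The Python sketch below computes that pairwise agreement rate. The function name, the tuple layout, and the toy scores are illustrative assumptions, not the paper's evaluation protocol or data.

```python
from typing import Iterable, Tuple


def pairwise_agreement(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of comparisons where the scorer agrees with the human choice.

    Each pair is (score_of_preferred_image, score_of_rejected_image).
    A tie counts as a disagreement.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one comparison")
    hits = sum(1 for preferred, rejected in pairs if preferred > rejected)
    return hits / len(pairs)


# Hypothetical scorer outputs for four human comparisons (made-up numbers).
toy_pairs = [(0.9, 0.2), (0.1, 0.4), (1.3, -0.5), (0.0, 0.0)]
agreement = pairwise_agreement(toy_pairs)
print(f"agreement with human preference: {agreement:.2f}")
```

Running the same function over the same comparisons for several scorers gives a like-for-like ranking of how well each one tracks human preference.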
Abstract
We present a comprehensive solution to learn and improve text-to-image models from human preference feedback. To begin with, we build ImageReward -- the first general-purpose text-to-image human preference reward model -- to effectively encode human preferences. Its training is based on our systematic annotation pipeline including rating and ranking, which collects 137k expert comparisons to date. In human evaluation, ImageReward outperforms existing scoring models and metrics, making it a promising automatic metric for evaluating text-to-image synthesis. On top of it, we propose Reward Feedback Learning (ReFL), a direct tuning algorithm to optimize diffusion models against a scorer. Both automatic and human evaluation support ReFL's advantages over compared methods. All code and datasets are provided at https://github.com/THUDM/ImageReward.
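The abstract says the reward model is trained on ranked comparisons but does not state the objective. A common choice for learning a scalar scorer from rankings is a pairwise logistic (Bradley-Terry style) loss, sketched below in plain Python. Treat it as an assumption about how such training might look, not as the authors' implementation; `pairwise_ranking_loss` and the example scores are hypothetical.

```python
import math
from typing import Sequence


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_ranking_loss(scores_best_first: Sequence[float]) -> float:
    """Mean of -log(sigmoid(s_i - s_j)) over all pairs with i ranked above j.

    `scores_best_first` holds the scorer's outputs for images generated from
    one prompt, ordered from the annotator's best image to the worst.
    """
    n = len(scores_best_first)
    if n < 2:
        raise ValueError("need at least two ranked images")
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            diff = scores_best_first[i] - scores_best_first[j]
            # -log(sigmoid(diff)) equals softplus(-diff)
            total += _softplus(-diff)
            count += 1
    return total / count


# Hypothetical scores: the first list agrees with the human ranking, the second reverses it.
loss_agree = pairwise_ranking_loss([2.0, 0.5, -1.0])
loss_reversed = pairwise_ranking_loss([-1.0, 0.5, 2.0])
print(loss_agree, loss_reversed)
```

The loss shrinks as the scorer separates preferred images from rejected ones by a wider margin, which is what makes the resulting score usable as a training signal for a method like ReFL.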
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games
Methods: Diffusion
