Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation

Katherine M. Collins; Najoung Kim; Yonatan Bitton; Verena Rieser; Shayegan Omidshafiei; Yushi Hu; Sherol Chen; Senjuti Dutta; Minsuk Chang; Kimin Lee; Youwei Liang; Georgina Evans; Sahil Singla; Gang Li; Adrian Weller; Junfeng He; Deepak Ramachandran; Krishnamurthy Dj Dvijotham

arXiv:2406.16807 · cs.LG · October 18, 2024

Open Access

TL;DR

This paper examines the potential and limitations of fine-grained human feedback for improving reward models in text-to-image generation, highlighting complexities and conditions under which it outperforms coarse feedback.

Contribution

It provides an empirical analysis of fine-grained versus coarse feedback, revealing challenges and conditions affecting their effectiveness in reward modeling for text-to-image tasks.

Findings

01. Fine-grained feedback can worsen models with limited budgets in some cases.

02. In controlled settings, fine-grained rewards outperform coarse feedback.

03. Model choice and feedback alignment critically influence outcomes.

Abstract

Human feedback plays a critical role in learning and refining reward models for text-to-image generation, but the optimal form that feedback should take for learning an accurate reward function has not been conclusively established. This paper investigates the effectiveness of fine-grained feedback, which captures nuanced distinctions in image quality and prompt alignment, compared to traditional coarse-grained feedback (for example, thumbs up/down or ranking among a set of options). While fine-grained feedback holds promise, particularly for systems catering to diverse societal preferences, we show that demonstrating its superiority to coarse-grained feedback is not automatic. Through experiments on real and synthetic preference data, we surface the complexities of building effective models due to the interplay of model choice, feedback type, and the alignment between human judgment…

Taxonomy

Topics: Computational Physics and Python Applications · Video Analysis and Summarization

Methods: Sparse Evolutionary Training