Fast Adaptation with Bradley-Terry Preference Models in Text-To-Image Classification and Generation

Victor Gallego

arXiv:2308.07929·cs.CV·September 22, 2023

TL;DR

This paper introduces a fast adaptation method using Bradley-Terry preference models to personalize large multimodal models like CLIP and Stable Diffusion for specific human preferences with minimal data and computational resources.

Contribution

It develops a novel, efficient fine-tuning approach leveraging Bradley-Terry models to adapt multimodal models to individual preferences, requiring few examples and low computation.

Findings

01. Effective preference prediction as reward models

02. Improved image generation aligned with user preferences

03. Minimal data and computational requirements achieved

Abstract

Recently, large multimodal models, such as CLIP and Stable Diffusion, have experienced tremendous success in both foundations and applications. However, as these models increase in parameter size and computational requirements, it becomes more challenging for users to personalize them for specific tasks or preferences. In this work, we address the problem of adapting these models towards sets of particular human preferences, aligning the retrieved or generated images with the preferences of the user. We leverage the Bradley-Terry preference model to develop a fast adaptation method that efficiently fine-tunes the original model, with few examples and with minimal computing resources. Extensive evidence of the capabilities of this framework is provided through experiments in different domains related to multimodal text and image understanding, including preference prediction as…
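
For readers unfamiliar with the Bradley-Terry model named in the abstract, the following is a minimal sketch of its standard form: the probability that one item is preferred over another is the logistic function of the difference of their scalar scores. The function names and the idea of passing precomputed scores are assumptions made for illustration; in a setting like the one described, the scores might come from an image-text similarity model, but this is not the paper's implementation.

```python
import math

def bt_preference_prob(score_preferred: float, score_rejected: float) -> float:
    """Bradley-Terry probability that the first item beats the second.

    P(first > second) = sigmoid(score_preferred - score_rejected).
    The scores are hypothetical scalar outputs of some scoring model.
    """
    diff = score_preferred - score_rejected
    # Numerically stable logistic function.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)

def bt_negative_log_likelihood(pairs) -> float:
    """Mean negative log-likelihood over (score_preferred, score_rejected) pairs.

    Minimizing this quantity pushes preferred items to score higher
    than rejected ones.
    """
    total = 0.0
    for s_w, s_l in pairs:
        diff = s_w - s_l
        # -log(sigmoid(diff)) = log(1 + exp(-diff)), computed stably.
        total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(pairs)
```

With equal scores the model is indifferent (probability 0.5), and the loss shrinks as the preferred item's score rises above the rejected one's.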

Taxonomy

Topics: Image Retrieval and Classification Techniques · Text and Document Classification Technologies · Multimodal Machine Learning Applications

Methods: Diffusion · Contrastive Language-Image Pre-training