Aligning Visual Contrastive learning models via Preference Optimization

Amirabbas Afzali; Borna Khodabandeh; Ali Rasekh; Mahyar JafariNodeh; Sepehr Kazemi; Simon Gottschalk

arXiv:2411.08923·cs.CV·March 27, 2025

TL;DR

This paper introduces a novel preference optimization approach to enhance contrastive learning models, improving robustness, fairness, and alignment with human preferences, especially against typographic attacks and gender bias.

Contribution

The paper pioneers the application of preference optimization methods to contrastive learning, improving model robustness and bias mitigation beyond traditional techniques.

Findings

01

Models trained with preference optimization outperform standard contrastive models.

02

Enhanced robustness against typographic attacks demonstrated.

03

Improved disentanglement of gender concepts and bias reduction.

Abstract

Contrastive learning models have demonstrated impressive abilities to capture semantic similarities by aligning representations in the embedding space. However, their performance can be limited by the quality of the training data and its inherent biases. While Preference Optimization (PO) methods such as Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO) have been applied to align generative models with human preferences, their use in contrastive learning has yet to be explored. This paper introduces a novel method for training contrastive learning models using different PO methods to break down complex concepts. Our method systematically aligns model behavior with desired preferences, enhancing performance on the targeted task. In particular, we focus on enhancing model robustness against typographic attacks and inductive biases, commonly seen in…
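To make the idea concrete, the following is a minimal sketch of how a DPO-style preference loss could be applied to a CLIP-like classifier. It is not the authors' implementation. It assumes the "policy" is the softmax over candidate text labels given an image, that a frozen copy of the pretrained model serves as the reference, and that for a typographic attack the true label is the preferred output and the label written on the image is the dispreferred one. The function names, the `beta` and `temperature` parameters, and the similarity values are all hypothetical and chosen only for illustration.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def dpo_loss(policy_sims, ref_sims, preferred, dispreferred,
             beta=1.0, temperature=0.01):
    """DPO-style loss for one image (illustrative assumption, not the paper's code).

    policy_sims / ref_sims: image-text similarities for each candidate label,
    from the trainable model and the frozen reference model respectively.
    preferred / dispreferred: indices of the true label and the attack label.
    """
    logp = log_softmax([s / temperature for s in policy_sims])
    logp_ref = log_softmax([s / temperature for s in ref_sims])
    # How much more the policy favors the preferred label than the reference does.
    margin = (logp[preferred] - logp_ref[preferred]) \
        - (logp[dispreferred] - logp_ref[dispreferred])
    # -log(sigmoid(beta * margin)), written as a stable softplus.
    z = -beta * margin
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical similarities: index 0 is the true label, index 1 the typographic label.
ref_sims = [0.20, 0.30]      # the reference model is fooled by the overlaid text
policy_sims = [0.30, 0.20]   # the fine-tuned model prefers the true label

loss_at_init = dpo_loss(ref_sims, ref_sims, 0, 1, temperature=0.1)
loss_after = dpo_loss(policy_sims, ref_sims, 0, 1, temperature=0.1)
print(loss_at_init, loss_after)
```

When the policy equals the reference, the margin is zero and the loss is log 2; it falls below that value as the policy shifts probability from the typographic label toward the true label relative to the reference.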

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 3

Strengths

- This paper appears to be the first paper to apply preference optimisation to contrastive models, and presents an interesting use of SVD to control model behaviour.
- Optimising robustness and mitigating (gender) biases are of significant interest, especially in high-risk domains.
- The evaluation results suggest comparable and often better performance than alternative approaches in improving robustness while enabling a (to some degree) interpretable intervention technique.
- The paper is we

Weaknesses

- Despite improving robustness over baseline methods in some datasets, none of the methods consistently outperforms other methods (see Table 1).
- The baseline methods, PAINT and Defense-prefix, and their differences to the proposed method are not explained in the paper.

Minor Comments:
- Line 23: Incomplete sentence „Our experiments We demonstrate“.
- Line 256: Comma instead of dot used.
- Line 258: Comma should be a dot, and dot should be a comma.
- Line 289: „this“ -> „This“
- The di

Reviewer 02 · Rating 8 · Confidence 2

Strengths

Originality: This is the first work to improve contrastive learning models through Preference Optimization. The idea of leveraging true labels and typographic labels for preferences, instead of curating a separate preference set from human annotation, is novel and interesting.

Clarity: This paper is well-written and has very clear motivations, backgrounds, methods, and experiments.

Significance: The topic of aligning human preferences in contrastive learning is impactful, as models like CLIP

Weaknesses

Significance: this paper relies on a preference dataset, which requires heavy annotations and the preference set will be very small compared to the training set of CLIP. Also, the preference would be very task-specific (e.g., typographic or gender), limiting the generalizability of the approach to new, unseen attacks or biases.

Quality: the inclusion of SVD makes it much slower to fine-tune on a larger scale. Also, the experiments focus on controlled, relatively smaller-scale datasets (the larg

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. The proposed method is simple yet effective.
2. The authors provide a new perspective on IPO and DPO concerning the representation space learned by CLIP.
3. The alignment controllability through $t$ is effective.
4. The background and motivation are well-organized.

Weaknesses

1. Clarity needs improvement.
   * $\mathcal{L}_{pref}$ in (10) appears without a definition. In Corollary 3.2, it is assumed to be either the DPO loss or IPO loss, while the experiments further include the case of KTO loss.
   * In (9), $\mathcal{I}_{ref}$ is frozen and has no trainable parameters, contributing solely to per-example weighting when substituted in (5), (6), and (7). It is recommended to clarify this in advance.
   * In Fig. 1, $\mathcal{L}_{pref}$ is computed with the given tri

Code & Models

Repositories

amirabbas-afzali/aligning-visual-contrastive-learning-models-via-preference-optimization
pytorch · Official

Videos

Aligning Visual Contrastive learning models via Preference Optimization · slideslive

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Constraint Satisfaction and Optimization

Methods: Contrastive Learning · Focus · ALIGN · Contrastive Language-Image Pre-training · Parrot optimizer: Algorithm and applications to medical problems