PRSM: A Measure to Evaluate CLIP's Robustness Against Paraphrases

Udo Schlegel; Franziska Weeber; Jian Lan; Thomas Seidl

arXiv:2511.11141 · cs.CL · November 17, 2025

Open Access

TL;DR

This paper introduces PRSM, a new metric for evaluating CLIP's robustness to paraphrasing, and finds that stability varies across paraphrasing strategies and differs subtly between gender-associated queries.

Contribution

The paper proposes PRSM, a novel measure for assessing CLIP's sensitivity to paraphrased inputs, and empirically analyzes its robustness and bias using the Social Counterfactuals dataset.
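This summary does not give the PRSM formula, so the following is only a minimal sketch of the general idea of ranking stability under paraphrase, not the paper's definition. It assumes that a query and its paraphrase have already been embedded (for example by CLIP's text encoder) alongside a set of candidate image embeddings, and it uses Spearman's rank correlation as a stand-in stability score. All function names here (`cosine`, `ranks`, `spearman`, `ranking_stability`) are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(scores: Sequence[float]) -> List[int]:
    """Rank items by descending score (1 = best).

    Ties are broken by original index, which is an assumption of this
    sketch; it guarantees that the ranks form a permutation.
    """
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(r1: Sequence[int], r2: Sequence[int]) -> float:
    """Spearman's rho for two tie-free rankings of the same items."""
    n = len(r1)
    if n < 2 or n != len(r2):
        raise ValueError("need two rankings of equal length >= 2")
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def ranking_stability(
    query_emb: Sequence[float],
    paraphrase_emb: Sequence[float],
    image_embs: Sequence[Sequence[float]],
) -> float:
    """Stand-in stability score (NOT the paper's PRSM).

    Compares the image ranking induced by the original query with the
    ranking induced by its paraphrase. 1.0 means identical rankings,
    -1.0 means fully reversed rankings.
    """
    original = [cosine(query_emb, img) for img in image_embs]
    paraphrased = [cosine(paraphrase_emb, img) for img in image_embs]
    return spearman(ranks(original), ranks(paraphrased))
```

In practice the embeddings would come from the model under evaluation, and the score would be aggregated over many query and paraphrase pairs, for instance separately per paraphrasing strategy or per demographic group, to compare stability across them.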

Findings

01 Robustness varies across different paraphrasing strategies.

02 Subtle differences in robustness are observed between gender-associated queries.

03 CLIP's stability is influenced by paraphrasing and demographic factors.

Abstract

Contrastive Language-Image Pre-training (CLIP) is a widely used multimodal model that aligns text and image representations through large-scale training. While it performs strongly on zero-shot and few-shot tasks, its robustness to linguistic variation, particularly paraphrasing, remains underexplored. Paraphrase robustness is essential for reliable deployment, especially in socially sensitive contexts where inconsistent representations can amplify demographic biases. In this paper, we introduce the Paraphrase Ranking Stability Metric (PRSM), a novel measure for quantifying CLIP's sensitivity to paraphrased queries. Using the Social Counterfactuals dataset, a benchmark designed to reveal social and demographic biases, we empirically assess CLIP's stability under paraphrastic variation, examine the interaction between paraphrase robustness and gender, and discuss implications for…

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning