Scendi Score: Prompt-Aware Diversity Evaluation via Schur Complement of CLIP Embeddings
Azim Ospanov, Mohammad Jalali, Farzan Farnia

TL;DR
This paper introduces the Scendi score, a prompt-aware diversity metric for text-to-image models. It is computed from the Schur complement of the joint image-text kernel covariance matrix of CLIP embeddings, and it is used to quantify and interpret the diversity of generated images.
Contribution
The paper proposes a diversity evaluation method built on the Schur complement of the joint image-text kernel covariance matrix of CLIP embeddings. Where CLIPScore measures text-image alignment, this method measures the intrinsic diversity of prompt-guided generative models.
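For reference, the standard Schur-complement identity behind such a decomposition can be sketched as follows. The block names $C_{II}$ (image), $C_{TT}$ (text) and $C_{IT}$ (cross term) are notation assumed here for illustration, and $C_{TT}$ is assumed invertible; the paper's exact notation, kernel and normalization may differ.

```latex
C =
\begin{bmatrix}
C_{II} & C_{IT} \\
C_{IT}^{\top} & C_{TT}
\end{bmatrix},
\qquad
C_{II}
= \underbrace{C_{IT}\, C_{TT}^{-1}\, C_{IT}^{\top}}_{\text{text-based component}}
+ \underbrace{\bigl(C_{II} - C_{IT}\, C_{TT}^{-1}\, C_{IT}^{\top}\bigr)}_{\text{non-text-based component (Schur complement)}}
```

Both terms on the right are positive semidefinite whenever the joint matrix $C$ is, which is what makes it possible to treat the second term as a covariance-like object in its own right.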
Findings
The Scendi score captures the intrinsic diversity of text-to-image models, separately from the diversity induced by the prompts.
The method can focus on, or defocus from, specific objects in image embeddings.
Numerical results support the Scendi score's ability to evaluate diversity.
Abstract
The use of CLIP embeddings to assess the fidelity of samples produced by text-to-image generative models has been extensively explored in the literature. While the widely adopted CLIPScore, derived from the cosine similarity of text and image embeddings, effectively measures the alignment of a generated image, it does not quantify the diversity of images generated by a text-to-image model. In this work, we extend the application of CLIP embeddings to quantify and interpret the intrinsic diversity of text-to-image models, that is, the diversity of the images they generate from similar text prompts, which we refer to as prompt-aware diversity. To achieve this, we propose a decomposition of the CLIP-based kernel covariance matrix of image data into text-based and non-text-based components. Using the Schur complement of the joint image-text kernel covariance matrix, we perform this…
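A minimal numerical sketch of this kind of decomposition is given below. It rests on several assumptions that are not taken from the paper: random unit-norm vectors stand in for CLIP embeddings, a plain linear (identity) feature map replaces the kernel, a small ridge term keeps the text block invertible, and the exponential of the eigenvalue entropy is used as one plausible way to turn the Schur complement into a single diversity number. The function name `schur_diversity` is hypothetical.

```python
import numpy as np

def schur_diversity(img_emb, txt_emb, ridge=1e-6):
    """Split the image covariance into text-based and residual parts.

    img_emb, txt_emb: arrays of shape (n_samples, dim), one row per pair.
    Returns (score, schur, text_part). The score is an illustrative
    entropy-based summary, not necessarily the paper's definition.
    """
    n = img_emb.shape[0]
    c_ii = img_emb.T @ img_emb / n          # image block
    c_it = img_emb.T @ txt_emb / n          # image-text cross block
    c_tt = txt_emb.T @ txt_emb / n          # text block

    # Ridge regularization (an assumption) so the text block is invertible.
    reg = c_tt + ridge * np.eye(c_tt.shape[0])
    text_part = c_it @ np.linalg.solve(reg, c_it.T)

    # Schur complement: the part of the image covariance not explained by text.
    schur = c_ii - text_part
    schur = (schur + schur.T) / 2           # remove numerical asymmetry

    # Entropy of the trace-normalized eigenvalues, exponentiated.
    eig = np.clip(np.linalg.eigvalsh(schur), 0.0, None)
    p = eig / eig.sum()
    p = p[p > 0]
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy)), schur, text_part

# Synthetic stand-ins for CLIP embeddings (assumption: not real model outputs).
rng = np.random.default_rng(0)
n, d = 200, 16
txt = rng.normal(size=(n, d))
txt /= np.linalg.norm(txt, axis=1, keepdims=True)
mix = rng.normal(size=(d, d))
img = 0.7 * txt @ mix + 0.3 * rng.normal(size=(n, d))
img /= np.linalg.norm(img, axis=1, keepdims=True)

score, schur, text_part = schur_diversity(img, txt)
print(f"illustrative prompt-aware diversity: {score:.3f}")
```

Because the residual block is positive semidefinite, its eigenvalues can be normalized into a probability vector, and the exponentiated entropy then lies between 1 and the embedding dimension. With real data, the rows of `img` and `txt` would be CLIP image and text embeddings of matched generated images and prompts.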
Taxonomy
Topics: Glycosylation and Glycoproteins Research · Carbohydrate Chemistry and Synthesis
Methods: Contrastive Language-Image Pre-training · Focus
