Describing Sets of Images with Textual-PCA

Oded Hupert; Idan Schwartz; Lior Wolf

arXiv:2210.12112·cs.CV·October 24, 2022

TL;DR

This paper introduces a method called Textual-PCA that semantically describes image sets by generating phrases that capture both the common attributes and variations within the set, using pretrained vision-language models.

Contribution

It proposes a novel approach replacing PCA projection vectors with generated phrases to describe image sets semantically, capturing both central themes and variations.

Findings

01. Effectively captures the essence of image sets.
02. Generates meaningful descriptions of individual images.
03. Uses pretrained models for semantic similarity and variation analysis.

Abstract

We seek to semantically describe a set of images, capturing both the attributes of single images and the variations within the set. Our procedure is analogous to Principal Component Analysis, in which the role of projection vectors is replaced with generated phrases. First, a centroid phrase that has the largest average semantic similarity to the images in the set is generated, where both the computation of the similarity and the generation are based on pretrained vision-language models. Then, the phrase that generates the highest variation among the similarity scores is generated, using the same models. The next phrase maximizes the variance subject to being orthogonal, in the latent space, to the highest-variance phrase, and the process continues. Our experiments show that our method is able to convincingly capture the essence of image sets and describe the individual elements in a…
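
The procedure in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the paper generates phrases with pretrained vision-language models, whereas this sketch only selects from a fixed pool of candidate phrases whose embeddings are assumed to be precomputed in the same latent space as the image embeddings. The function name `textual_pca_select`, the candidate pool, and the projection-based deflation used to enforce orthogonality are all assumptions made for illustration.

```python
import numpy as np


def normalize(x):
    """Scale each row to unit L2 norm."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def textual_pca_select(image_emb, phrase_emb, n_components=2):
    """Hypothetical selection-based analogue of the described procedure.

    image_emb:  (n_images, d) image embeddings
    phrase_emb: (n_phrases, d) embeddings of candidate phrases (assumed pool)
    Returns (centroid_index, list_of_component_indices).
    """
    images = normalize(np.asarray(image_emb, dtype=float))
    phrases = normalize(np.asarray(phrase_emb, dtype=float))

    # Centroid phrase: largest average cosine similarity to the images.
    sims = images @ phrases.T  # shape (n_images, n_phrases)
    centroid = int(np.argmax(sims.mean(axis=0)))

    components = []
    residual = phrases.copy()  # candidates with chosen directions removed
    for _ in range(n_components):
        norms = np.linalg.norm(residual, axis=1)
        dirs = residual / np.maximum(norms, 1e-12)[:, None]
        # Variance of the image scores along each remaining direction.
        variances = (images @ dirs.T).var(axis=0)
        variances[norms < 1e-8] = -np.inf  # fully explained candidates
        best = int(np.argmax(variances))
        if not np.isfinite(variances[best]):
            break  # no candidate direction is left
        components.append(best)
        # Assumed orthogonality step: project the chosen direction out of
        # every candidate, so later picks are orthogonal to earlier ones.
        d = dirs[best]
        residual = residual - np.outer(residual @ d, d)
    return centroid, components
```

With embeddings from a joint image-text encoder, calling `textual_pca_select(image_emb, phrase_emb, n_components=3)` would return the index of the candidate that best summarizes the set, followed by up to three candidates that separate the images most strongly along mutually orthogonal directions.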

Code & Models

Repositories

odedh/textual-pca (PyTorch, official)

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications