On the rankability of visual embeddings

Ankit Sonthalia; Arnas Uselis; Seong Joon Oh

arXiv:2507.03683 · cs.CV · July 8, 2025

TL;DR

This paper investigates whether visual embeddings encode continuous, ordinal attributes along linear directions. It finds that many embeddings are inherently rankable and can be used for image ranking with minimal supervision.

Contribution

The paper introduces the concept of rankability for visual embeddings and shows that simple methods can recover meaningful rank axes across 7 popular encoders and 9 datasets.

Findings

01. Many embeddings are inherently rankable.

02. A small number of samples, or even just two extreme examples, often suffice to recover a meaningful rank axis.

03. Rankable embeddings open up new use cases for image ranking in vector databases.

Abstract

We study whether visual embedding models capture continuous, ordinal attributes along linear directions, which we term _rank axes_. We define a model as _rankable_ for an attribute if projecting embeddings onto such an axis preserves the attribute's order. Across 7 popular encoders and 9 datasets with attributes like age, crowd count, head pose, aesthetics, and recency, we find that many embeddings are inherently rankable. Surprisingly, a small number of samples, or even just two extreme examples, often suffice to recover meaningful rank axes, without full-scale supervision. These findings open up new use cases for image ranking in vector databases and motivate further study into the structure and learning of rankable embeddings. Our code is available at https://github.com/aktsonthalia/rankable-vision-embeddings.
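The idea of a rank axis can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the authors' code: the embeddings are synthetic stand-ins for encoder outputs, the helper names (`axis_from_extremes`, `axis_from_few_samples`, `spearman`) are hypothetical, least-squares regression is one plausible way to fit an axis from a few labeled samples, and Spearman rank correlation is used here as one reasonable way to check whether a projection preserves the attribute's order.

```python
import numpy as np


def ranks(x):
    """Return 0-based ranks of x (assumes no ties, for simplicity)."""
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(len(x))
    return r


def spearman(a, b):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])


def axis_from_extremes(z_low, z_high):
    """Hypothetical helper: unit vector from a low extreme to a high extreme."""
    d = z_high - z_low
    return d / np.linalg.norm(d)


def axis_from_few_samples(Z, y):
    """Hypothetical helper: unit-norm least-squares direction for predicting y."""
    Zc = Z - Z.mean(axis=0)
    w, *_ = np.linalg.lstsq(Zc, y - y.mean(), rcond=None)
    return w / np.linalg.norm(w)


# Synthetic data (assumption): an attribute varies along one hidden direction,
# plus isotropic noise. Real encoder embeddings would replace this.
rng = np.random.default_rng(0)
n, dim = 500, 32
attribute = rng.uniform(0.0, 100.0, size=n)
hidden_dir = rng.normal(size=dim)
hidden_dir /= np.linalg.norm(hidden_dir)
embeddings = np.outer(attribute / 100.0, hidden_dir)
embeddings += 0.05 * rng.normal(size=(n, dim))

# Rank axis from just two extreme examples (lowest and highest attribute).
lo, hi = int(np.argmin(attribute)), int(np.argmax(attribute))
axis_extremes = axis_from_extremes(embeddings[lo], embeddings[hi])
rho_extremes = spearman(embeddings @ axis_extremes, attribute)

# Rank axis from a small labeled subset (64 of the 500 samples).
subset = rng.choice(n, size=64, replace=False)
axis_few = axis_from_few_samples(embeddings[subset], attribute[subset])
rho_few = spearman(embeddings @ axis_few, attribute)

# Both correlations are measured over all 500 samples.
print(f"two extremes: rho = {rho_extremes:.3f}")
print(f"64 samples:   rho = {rho_few:.3f}")
```

Under this reading, a correlation close to 1 means that sorting images by their projection onto the axis reproduces the attribute's order. In a vector database, the same projection is a single dot product per stored embedding, which is what makes ranking by such an axis inexpensive.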
