Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning

Shashanka Venkataramanan; Valentinos Pariza; Mohammadreza Salehi; Lukas Knobel; Spyros Gidaris; Elias Ramzi; Andrei Bursuc; Yuki M. Asano

arXiv:2507.14137 · cs.CV · April 28, 2026

TL;DR

Franca is a fully open-source (data, code, weights) vision foundation model that matches, and in many cases surpasses, proprietary models. It combines nested Matryoshka clustering with positional disentanglement for scalable, efficient, and transparent visual representation learning.

Contribution

The paper introduces a multi-head clustering projector built on nested Matryoshka representations, together with a positional disentanglement strategy, advancing open, high-performance vision models.

Findings

01  Matches or surpasses state-of-the-art proprietary models

02  Improves downstream benchmark performance with cleaner features

03  Offers a scalable, memory-efficient clustering approach

Abstract

We present Franca (pronounced Fran-ka, "free one"), the first fully open-source (data, code, weights) vision foundation model that matches and in many cases surpasses the performance of state-of-the-art proprietary models such as DINOv2, CLIP, and SigLIPv2. Our approach is grounded in a transparent training pipeline inspired by Web-SSL and uses publicly available data: ImageNet-21K and a subset of ReLAION-2B. Beyond the model release, we tackle critical limitations in SSL clustering methods. While modern models rely on assigning image features to large codebooks via clustering algorithms like Sinkhorn-Knopp, they fail to account for the inherent ambiguity in clustering semantics. To address this, we introduce a parameter-efficient, multi-head clustering projector based on nested Matryoshka representations. This design progressively refines features into increasingly fine-grained clusters…
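The nested clustering idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each head reads a prefix slice of the feature vector (the Matryoshka nesting), that larger slices are paired with more clusters, and that assignments are balanced with a SwAV-style Sinkhorn-Knopp normalization. The function names, slice widths (16, 32, 64), cluster counts (8, 16, 32), and temperature are all hypothetical choices made for illustration.

```python
import numpy as np


def sinkhorn_knopp(scores, n_iters=3, eps=0.05):
    """Turn a (batch, clusters) similarity matrix into balanced soft assignments.

    Assumes scores are cosine similarities in [-1, 1], so exp(scores / eps)
    stays well inside floating-point range.
    """
    q = np.exp(scores / eps).T                 # shape (clusters, batch)
    q /= q.sum()
    n_clusters, batch = q.shape
    for _ in range(n_iters):
        q /= q.sum(axis=1, keepdims=True)      # equalize mass per cluster
        q /= n_clusters
        q /= q.sum(axis=0, keepdims=True)      # equalize mass per sample
        q /= batch
    return (q * batch).T                       # each sample's row sums to 1


def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def nested_assignments(features, dims, prototypes):
    """Cluster nested prefix slices of the features, one head per slice.

    features:   (batch, D) array
    dims:       increasing slice widths, each <= D (hypothetical values)
    prototypes: one (clusters_i, dims_i) array of unit-norm rows per head
    """
    out = {}
    for d, protos in zip(dims, prototypes):
        sub = l2_normalize(features[:, :d])    # first d dimensions only
        out[d] = sinkhorn_knopp(sub @ protos.T)
    return out


rng = np.random.default_rng(0)
features = rng.normal(size=(32, 64))
dims = (16, 32, 64)                            # assumed slice widths
n_clusters = (8, 16, 32)                       # assumed: finer clusters for wider slices
prototypes = [l2_normalize(rng.normal(size=(k, d))) for k, d in zip(n_clusters, dims)]

assignments = nested_assignments(features, dims, prototypes)
for d, q in assignments.items():
    print(d, q.shape)
```

Because every head reuses the same leading feature dimensions, the coarse heads add only their prototype matrices as extra parameters, which is one plausible reading of "parameter-efficient" in the abstract.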

Code & Models

Repositories

valeoai/Franca (GitHub)

Models

birder-project/vit_b16_ls_franca-bioscan5m (Hugging Face model · 219 downloads · 1 like)
