Enhancing VICReg: Random-Walk Pairing for Improved Generalization and Better Global Semantics Capturing

Idan Simai; Ronen Talmon; Uri Shaham

arXiv:2506.18104·cs.CV·June 24, 2025

TL;DR

This paper introduces SAG-VICReg, a self-supervised learning method that builds on VICReg with new training techniques so that the learned representations capture global semantics better and generalize more robustly to unseen data. It also proposes a new metric for evaluating embeddings.

Contribution

It proposes SAG-VICReg, an extension of VICReg with new training techniques, together with a new embedding evaluation metric, aimed at capturing global semantics and improving robustness.

Findings

1. SAG-VICReg outperforms existing SSL methods on global semantic metrics.

2. The method maintains competitive local evaluation results.

3. The new metric effectively assesses global data structure without labels.

Abstract

In this paper, we argue that viewing VICReg, a popular self-supervised learning (SSL) method, through the lens of spectral embedding reveals a potential source of sub-optimality: it may struggle to generalize robustly to unseen data due to overreliance on the training data. This observation invites a closer look at how well the method achieves its goal of producing meaningful representations of images outside the training set. Here, we investigate this issue and introduce SAG-VICReg (Stable and Generalizable VICReg), a method that builds on VICReg by incorporating new training techniques. These enhancements improve the model's ability to capture global semantics within the data and strengthen its generalization capabilities. Experiments demonstrate that SAG-VICReg effectively addresses the generalization challenge while matching or surpassing diverse state-of-the-art SSL…
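
For background, the baseline VICReg objective that the abstract refers to can be sketched as follows. This is a minimal, dependency-free illustration of the standard variance-invariance-covariance loss as commonly described for VICReg, not the SAG-VICReg method, whose training techniques are not detailed on this page. The function names (`vicreg_terms`, `vicreg_loss`) and the pure-Python list representation are assumptions made for illustration. The default weights (25, 25, 1) follow commonly cited VICReg settings, and normalization conventions for the invariance term vary between implementations.

```python
import math


def _covariance(z):
    """Unbiased covariance matrix (d x d) of a batch given as n rows of length d."""
    n, d = len(z), len(z[0])
    means = [sum(row[j] for row in z) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in z]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def vicreg_terms(z_a, z_b, gamma=1.0, eps=1e-4):
    """Return the (invariance, variance, covariance) terms for two embedded views."""
    n, d = len(z_a), len(z_a[0])

    # Invariance: mean squared Euclidean distance between paired embeddings.
    invariance = sum(
        (a - b) ** 2 for ra, rb in zip(z_a, z_b) for a, b in zip(ra, rb)
    ) / n

    variance = 0.0
    covariance = 0.0
    for z in (z_a, z_b):
        cov = _covariance(z)
        # Variance: hinge that pushes each dimension's std above gamma.
        variance += sum(
            max(0.0, gamma - math.sqrt(cov[j][j] + eps)) for j in range(d)
        ) / d
        # Covariance: penalize squared off-diagonal entries to decorrelate dimensions.
        covariance += sum(
            cov[i][j] ** 2 for i in range(d) for j in range(d) if i != j
        ) / d
    return invariance, variance, covariance


def vicreg_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0):
    """Weighted sum of the three terms (weights are assumed defaults)."""
    inv, var, cov = vicreg_terms(z_a, z_b)
    return sim_w * inv + var_w * var + cov_w * cov
```

Passing a collapsed batch (every row identical) as both views gives zero invariance and covariance terms but a large variance penalty, which is how the objective discourages trivial constant embeddings.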
