Constrained Mean Shift for Representation Learning
Ajinkya Tejankar, Soroush Abbasi Koohpayegani, and Hamed Pirsiavash

TL;DR
This paper introduces a constrained mean-shift method for representation learning that leverages additional knowledge to produce more semantically meaningful embeddings, improving transfer performance and robustness to noise.
Contribution
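It generalizes the mean-shift algorithm by constraining the nearest-neighbor search space with additional knowledge (annotated labels, or an SSL model from another modality), giving a non-contrastive loss that yields semantically purer representations.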
Findings
Supervised ImageNet-1k pretraining with the method gives better transfer performance than the baselines.
Shows robustness to label noise.
Enables cross-modal self-supervised video training.
Abstract
We are interested in representation learning from labeled or unlabeled data. Inspired by recent success of self-supervised learning (SSL), we develop a non-contrastive representation learning method that can exploit additional knowledge. This additional knowledge may come from annotated labels in the supervised setting or an SSL model from another modality in the SSL setting. Our main idea is to generalize the mean-shift algorithm by constraining the search space of nearest neighbors, resulting in semantically purer representations. Our method simply pulls the embedding of an instance closer to its nearest neighbors in a search space that is constrained using the additional knowledge. By leveraging this non-contrastive loss, we show that the supervised ImageNet-1k pretraining with our method results in better transfer performance as compared to the baselines. Further, we demonstrate…
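The core step described in the abstract (pull an instance's embedding toward its nearest neighbors, searching only within a constrained set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `constrained_mean_shift_loss`, the separate `query` and `target` embeddings, the in-memory `bank`, the same-label constraint, and the `2 - 2 * cosine` distance are all choices made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Dot product of two already-normalized vectors."""
    return sum(a * b for a, b in zip(u, v))


def constrained_mean_shift_loss(query, target, bank, bank_labels, label, k=2):
    """Average distance from `query` to the top-k neighbors of `target`.

    Assumption: the constraint is "same label as the instance". Neighbors are
    searched only among bank entries that satisfy it. The squared Euclidean
    distance between unit vectors equals 2 - 2 * cosine similarity.
    """
    q = l2_normalize(query)
    t = l2_normalize(target)
    # Constrain the search space using the additional knowledge (labels here).
    candidates = [l2_normalize(b) for b, y in zip(bank, bank_labels) if y == label]
    if not candidates:
        raise ValueError("no bank entry satisfies the constraint")
    # Rank the constrained candidates by similarity to the target embedding.
    candidates.sort(key=lambda c: cosine(t, c), reverse=True)
    neighbors = candidates[:k]
    # Non-contrastive: only pull toward neighbors, no negatives are pushed away.
    return sum(2.0 - 2.0 * cosine(q, n) for n in neighbors) / len(neighbors)


# Toy bank: two embeddings per class, roughly aligned with one axis each.
bank = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
bank_labels = [0, 0, 1, 1]

loss_same = constrained_mean_shift_loss([1.0, 0.0], [1.0, 0.0], bank, bank_labels, label=0, k=1)
loss_other = constrained_mean_shift_loss([1.0, 0.0], [1.0, 0.0], bank, bank_labels, label=1, k=1)
```

In this toy setup, constraining the search to the instance's own class makes the selected neighbor lie close to the query, so `loss_same` is smaller than `loss_other`. In the cross-modal SSL setting, the label test would be replaced by a constraint derived from a model of another modality; that variant is not shown here.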
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
