Neural Eigenfunctions Are Structured Representation Learners
Zhijie Deng, Jiaxin Shi, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu

TL;DR
This paper presents Neural Eigenmap, a neural network-based spectral method that produces structured, adaptive-length representations, improving efficiency and effectiveness in image retrieval and graph node classification tasks.
Contribution
It introduces Neural Eigenmap, a parametric spectral method that generates ordered, structured representations, enabling shorter codes with comparable performance to existing self-supervised methods.
Findings
Achieves up to 16x shorter representations than leading self-supervised methods in image retrieval.
Demonstrates strong results on large-scale graph node classification.
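Shows that applying NeuralEF to eigenfunctions derived from data-augmentation relations yields an objective resembling those of popular self-supervised learning methods.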
Abstract
This paper introduces a structured, adaptive-length deep representation called Neural Eigenmap. Unlike prior spectral methods such as Laplacian Eigenmap that operate in a nonparametric manner, Neural Eigenmap leverages NeuralEF to parametrically model eigenfunctions using a neural network. We show that, when the eigenfunction is derived from positive relations in a data augmentation setup, applying NeuralEF results in an objective function that resembles those of popular self-supervised learning methods, with an additional symmetry-breaking property that leads to structured representations where features are ordered by importance. We demonstrate using such representations as adaptive-length codes in image retrieval systems. By truncation according to feature importance, our method requires up to 16x shorter representation length than leading self-supervised learning ones…
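The truncation idea in the abstract can be illustrated with a minimal sketch. It assumes the features are already ordered by importance (as the abstract describes) and uses synthetic data with decaying scale as a stand-in for learned representations; the helper names `truncate_codes` and `retrieve` are hypothetical and not taken from the paper's code.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def truncate_codes(features, k):
    # Keep only the first k dimensions, which are ASSUMED to be the most
    # important ones, then re-normalize the shortened codes.
    return l2_normalize(features[:, :k])

def retrieve(query_codes, database_codes, top_n=5):
    # Rank database items by cosine similarity to each query (highest first).
    sims = query_codes @ database_codes.T
    return np.argsort(-sims, axis=1)[:, :top_n]

rng = np.random.default_rng(0)
dim = 64

# Synthetic stand-in for ordered features: per-dimension scale decays with index.
scales = 1.0 / np.arange(1, dim + 1)
database = rng.normal(size=(100, dim)) * scales
queries = database[:10] + 0.01 * rng.normal(size=(10, dim)) * scales

# Retrieval with full-length codes versus codes truncated to 8 dimensions (8x shorter).
full_db, short_db = truncate_codes(database, dim), truncate_codes(database, 8)
full_hits = retrieve(truncate_codes(queries, dim), full_db, top_n=5)
short_hits = retrieve(truncate_codes(queries, 8), short_db, top_n=5)

# Fraction of queries whose top-1 result agrees between the two code lengths.
agreement = float(np.mean(full_hits[:, 0] == short_hits[:, 0]))
print("top-1 agreement between full and truncated codes:", agreement)
```

Because the same network output serves every code length, a retrieval system built this way could pick `k` per query or per storage budget without retraining; how much accuracy survives a given `k` depends on the learned features, not on this sketch.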
Peer Reviews
Decision: Submitted to ICLR 2024
- Unsupervised representation learning is an interesting problem, and the proposed approach extends previous works.
- The technical part of the paper seems to be correct, but I have not checked all the theoretical results in detail.
- The proposed methods perform better in some of the experiments.

- The paper is adequately written, but some parts need improvement. For example, it is not entirely clear which methods the authors propose and how they differ from previous works.
- Regarding novelty, it is unclear to me how the proposed approaches differ from previous works, which makes it hard to understand the actual contributions.
- In some of the experiments, the improvement does not seem significant.
- Through this work, the authors provide additional arguments for learning the principal eigenfunctions for unsupervised representation learning by considering the integral operator of a pre-defined kernel and the data distribution, which other lines of work in this area have argued to be at the core of many machine learning problems.
- Over pre-existing work such as Johnson et al. and HaoChen et al., Neural Eigenmap shows better scalability with the number of eigenfunctions, and learning of an or

- Maybe not a weakness, but in this kind of learning, how do we know which kernel to pre-define?
- With respect to the linear probe experiments for unsupervised representation learning, a few questions:
  - Why have the authors reproduced Barlow Twins themselves when, if I am not mistaken, the top-1 accuracy for ImageNet with a ResNet-50 pretrained encoder is available (like it is for SCL, which the authors use)?
  - Can you make any inferential comments on the link between batch size, numb
The paper leverages a recently proposed approach for eigenfunction estimation in a commonly used application of eigendecompositions for learning. The reasoning is straightforward and the results are compelling.
Some of the choices made are not clearly explained. For example, enabling or disabling stop_grad is said to allow or prevent ordered eigenfunctions and structured representations, but there is no discussion of this (e.g., what stop_grad does). This is also implied by the choice of elements with small indices versus random elements when evaluating these two approaches. The proposed method is similar in formulation to Barlow Twins, but the numerical comparison is focused on
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
Methods: Contrastive Learning
