Similarity Learning for High-Dimensional Sparse Data
Kuan Liu, Aurélien Bellet, Fei Sha

TL;DR
This paper introduces an efficient similarity learning method for high-dimensional sparse data that leverages a convex combination of rank-one matrices and an approximate Frank-Wolfe algorithm, improving scalability and reducing overfitting.
Contribution
It proposes a novel scalable similarity learning approach that optimizes a convex combination of sparse rank-one matrices using an approximate Frank-Wolfe method, suitable for high-dimensional data.
Findings
Effective on real-world high-dimensional datasets
Reduces overfitting by controlling active features
Demonstrates potential for classification and data exploration
Abstract
A good measure of similarity between data points is crucial to many tasks in machine learning. Similarity and metric learning methods learn such measures automatically from data, but they do not scale well with respect to the dimensionality of the data. In this paper, we propose a method that can efficiently learn a similarity measure from high-dimensional sparse data. The core idea is to parameterize the similarity measure as a convex combination of rank-one matrices with specific sparsity structures. The parameters are then optimized with an approximate Frank-Wolfe procedure to maximally satisfy relative similarity constraints on the training data. Our algorithm greedily incorporates one pair of features at a time into the similarity measure, providing an efficient way to control the number of active features and thus reduce overfitting. It enjoys very appealing convergence guarantees and…
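The procedure described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, not something this page confirms: the bilinear form S_M(x, y) = x^T M y, the basis matrices lam * (e_i ± e_j)(e_i ± e_j)^T, the squared hinge loss on triplets, the step size 2 / (t + 2), and all function names are illustrative. The sketch also uses dense matrices and an exact search over feature pairs, whereas the paper describes an approximate Frank-Wolfe step suited to high-dimensional sparse data.

```python
import numpy as np


def similarity(M, x, y):
    """Bilinear similarity S_M(x, y) = x^T M y (assumed form)."""
    return x @ M @ y


def loss_and_grad(M, X, triplets):
    """Average squared hinge loss over triplets (i, j, k), which ask that
    x_i be more similar to x_j than to x_k by a margin of 1.
    The squared hinge is an assumption chosen because it is smooth."""
    d = X.shape[1]
    G = np.zeros((d, d))
    total = 0.0
    for i, j, k in triplets:
        diff = X[j] - X[k]
        margin = 1.0 - X[i] @ M @ diff
        if margin > 0:
            total += margin ** 2
            G -= 2.0 * margin * np.outer(X[i], diff)
    n = len(triplets)
    return total / n, G / n


def frank_wolfe_similarity(X, triplets, lam=1.0, n_iters=20):
    """Frank-Wolfe over the convex hull of rank-one matrices
    lam * (e_i + s * e_j)(e_i + s * e_j)^T with i < j and s in {+1, -1}.
    Each iteration adds at most one new pair of features to M."""
    d = X.shape[1]
    M = np.zeros((d, d))
    iu, ju = np.triu_indices(d, k=1)  # all feature pairs with i < j
    for t in range(n_iters):
        _, G = loss_and_grad(M, X, triplets)
        diag = np.diag(G)
        base = diag[iu] + diag[ju]
        cross = G[iu, ju] + G[ju, iu]
        # Inner product <B, G> for every candidate basis matrix B.
        scores_plus = lam * (base + cross)
        scores_minus = lam * (base - cross)
        p, m = np.argmin(scores_plus), np.argmin(scores_minus)
        if scores_plus[p] <= scores_minus[m]:
            i, j, s = iu[p], ju[p], 1.0
        else:
            i, j, s = iu[m], ju[m], -1.0
        # Build the selected rank-one basis matrix (four nonzero entries).
        B = np.zeros((d, d))
        B[i, i] = B[j, j] = lam
        B[i, j] = B[j, i] = s * lam
        gamma = 2.0 / (t + 2.0)  # standard Frank-Wolfe step size
        M = (1.0 - gamma) * M + gamma * B
    return M
```

Because each basis matrix touches only two features, stopping after T iterations leaves at most 2T active features in M, which is the early-stopping control over model complexity that the abstract refers to. A call such as `frank_wolfe_similarity(X, triplets, lam=2.0, n_iters=5)` returns a symmetric matrix whose trace equals 2 * lam, since every iterate is a convex combination of basis matrices with that trace.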
Taxonomy
Topics: Face and Expression Recognition · Sparse and Compressive Sensing Techniques · Advanced Image and Video Retrieval Techniques
