Partial-Hessian Strategies for Fast Learning of Nonlinear Embeddings

Max Vladymyrov (UC Merced); Miguel Carreira-Perpinan (UC Merced)

arXiv:1206.4646 · cs.LG · June 22, 2012 · 23 citations

TL;DR

The paper introduces partial-Hessian optimization strategies for nonlinear embedding algorithms such as SNE. By relating these algorithms to spectral methods and graph Laplacians, it speeds up training by up to two orders of magnitude while maintaining embedding quality.

Contribution

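The paper proposes a generic formulation of embedding algorithms that includes SNE, defines several partial-Hessian optimization strategies (among them the spectral direction), characterizes their global and local convergence, and demonstrates up to a 100x training speedup over existing methods.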

Findings

01

Up to two orders of magnitude speedup over existing methods

02

Spectral direction strategy is simple, scalable, and effective

03

The approach applies to several existing and future embedding algorithms

Abstract

Stochastic neighbor embedding (SNE) and related nonlinear manifold learning algorithms achieve high-quality low-dimensional representations of similarity data, but are notoriously slow to train. We propose a generic formulation of embedding algorithms that includes SNE and other existing algorithms, and study their relation with spectral methods and graph Laplacians. This allows us to define several partial-Hessian optimization strategies, characterize their global and local convergence, and evaluate them empirically. We achieve up to two orders of magnitude speedup over existing training methods with a strategy (which we call the spectral direction) that adds nearly no overhead to the gradient and yet is simple, scalable and applicable to several existing and future embedding algorithms.
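
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a partial-Hessian step of this kind might look like, not the authors' implementation. It assumes an elastic-embedding-style objective, E(X) = sum_nm w+_nm ||x_n - x_m||^2 + lam * sum_nm w-_nm exp(-||x_n - x_m||^2), and it assumes the "partial Hessian" is the constant attractive-term matrix 4 L+ (L+ being the graph Laplacian of the attractive weights) plus a small ridge `mu` to make it positive definite. The function names, the ridge value, and the backtracking line search are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve


def laplacian(W):
    """Graph Laplacian L = D - W of a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W


def sqdist(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    sq = (X ** 2).sum(axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)


def energy_and_grad(X, Wp, Wm, lam):
    """Energy and gradient of the assumed elastic-embedding-style objective."""
    D2 = sqdist(X)
    K = Wm * np.exp(-D2)                  # repulsive weights at the current X
    E = (Wp * D2).sum() + lam * K.sum()
    G = 4.0 * (laplacian(Wp) - lam * laplacian(K)) @ X
    return E, G


def fit(X0, Wp, Wm, lam=1.0, mu=1e-3, iters=50):
    """Minimize the energy with a Laplacian-preconditioned search direction."""
    N = X0.shape[0]
    # Assumed partial Hessian: attractive term only. It does not depend on X,
    # so it is factorized once and the factor is reused at every iteration.
    B = 4.0 * laplacian(Wp) + mu * np.eye(N)
    chol = cho_factor(B)

    X = X0.copy()
    E, G = energy_and_grad(X, Wp, Wm, lam)
    history = [E]
    for _ in range(iters):
        P = -cho_solve(chol, G)           # solve B P = -G (two triangular solves)
        slope = (G * P).sum()             # negative because B is positive definite
        step, accepted = 1.0, False
        while step > 1e-12:               # backtracking (Armijo) line search
            E_new, G_new = energy_and_grad(X + step * P, Wp, Wm, lam)
            if E_new <= E + 1e-4 * step * slope:
                accepted = True
                break
            step *= 0.5
        if not accepted:
            break                         # no acceptable step: stop early
        X, E, G = X + step * P, E_new, G_new
        history.append(E)
    return X, history
```

Under these assumptions, the only per-iteration cost beyond the gradient is the pair of triangular solves in `cho_solve`, which is consistent with the abstract's description of a strategy that "adds nearly no overhead to the gradient". For large, sparse attractive graphs a sparse factorization would be the natural replacement for the dense one used here.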

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks · Machine Learning and ELM