Efficient Distance Metric Learning by Adaptive Sampling and Mini-Batch Stochastic Gradient Descent (SGD)

Qi Qian; Rong Jin; Jinfeng Yi; Lijun Zhang; Shenghuo Zhu

arXiv:1304.1192·cs.LG·April 5, 2013

TL;DR

This paper introduces adaptive sampling and mini-batch strategies within stochastic gradient descent to cut the computational cost of distance metric learning, chiefly by reducing the number of expensive PSD projections.

Contribution

The paper develops SGD-based algorithms, backed by theoretical guarantees, that reduce the number of PSD projections required in DML.

Findings

1. Reduced number of PSD projections in DML algorithms

2. Theoretical guarantees for adaptive sampling and mini-batch methods

3. Empirical validation showing improved efficiency and comparable accuracy

Abstract

Distance metric learning (DML) is an important task that has found applications in many domains. The high computational cost of DML arises from the large number of variables to be determined and the constraint that a distance metric has to be a positive semi-definite (PSD) matrix. Although stochastic gradient descent (SGD) has been successfully applied to improve the efficiency of DML, it can still be computationally expensive: to ensure that the solution is a PSD matrix, SGD must project the updated distance metric onto the PSD cone at every iteration, which is an expensive operation. We address this challenge by developing two strategies within SGD, namely mini-batch and adaptive sampling, that effectively reduce the number of updates (i.e., projections onto the PSD cone) in SGD. We also develop hybrid approaches that combine the strength of adaptive sampling with that of…
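The mini-batch idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the triplet hinge loss, the margin, the step size, and the function names `project_psd`, `triplet_grad`, and `minibatch_sgd` are all assumptions made for illustration. The sketch only shows why averaging gradients over a batch and projecting once per batch lowers the number of PSD projections relative to projecting after every sample.

```python
import numpy as np


def project_psd(M):
    """Frobenius-norm projection of a square matrix onto the PSD cone."""
    S = (M + M.T) / 2.0                # symmetrize before decomposing
    w, V = np.linalg.eigh(S)           # eigendecomposition, O(d^3) cost
    w = np.clip(w, 0.0, None)          # zero out negative eigenvalues
    return (V * w) @ V.T               # V diag(w) V^T


def triplet_grad(M, xi, xj, xk, margin=1.0):
    """Subgradient of an assumed hinge loss on one triplet:
    max(0, margin + d_M(xi, xj) - d_M(xi, xk)),
    where d_M(a, b) = (a - b)^T M (a - b)."""
    a = xi - xj                        # xi and xj are assumed similar
    b = xi - xk                        # xi and xk are assumed dissimilar
    if margin + a @ M @ a - b @ M @ b > 0:
        return np.outer(a, a) - np.outer(b, b)
    return np.zeros_like(M)


def minibatch_sgd(X, triplets, batch_size=10, lr=0.01, seed=0):
    """One pass over the triplets with a single PSD projection per batch.
    Returns the learned metric and the number of projections performed."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    M = np.eye(d)                      # start from the Euclidean metric
    n_proj = 0
    order = rng.permutation(len(triplets))
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        G = np.zeros((d, d))
        for t in batch:
            i, j, k = triplets[t]
            G += triplet_grad(M, X[i], X[j], X[k])
        # Average the batch gradient, take one step, project once.
        M = project_psd(M - lr * G / len(batch))
        n_proj += 1
    return M, n_proj


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(30, 4))       # synthetic data, illustration only
    triplets = [tuple(rng.choice(30, size=3, replace=False))
                for _ in range(100)]
    _, p_single = minibatch_sgd(X, triplets, batch_size=1)
    _, p_batch = minibatch_sgd(X, triplets, batch_size=10)
    print("projections with batch_size=1: ", p_single)
    print("projections with batch_size=10:", p_batch)
```

With `batch_size=1` the loop reduces to plain SGD with one projection per triplet, so the batch size directly controls how many eigendecompositions are performed per pass. Adaptive sampling, the second strategy named in the abstract, is not shown here because the truncated text does not specify its sampling rule.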

Taxonomy

Topics: Face and Expression Recognition · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM

Methods: Stochastic Gradient Descent