Self-Similar Epochs: Value in Arrangement

Eliav Buchnik; Edith Cohen; Avinatan Hassidim; Yossi Matias

arXiv:1803.05389 · cs.LG · June 20, 2019 · 1 citation

TL;DR

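This paper proposes self-similar arrangements of training examples for matrix factorization. The arrangements preserve the weighted Jaccard similarities of rows and columns, and training was observed to accelerate by 3-37% on synthetic and recommendation datasets, which suggests example arrangement as a new enhancement to stochastic gradient descent.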

Contribution

It introduces a novel data arrangement method that maintains data similarity structures, improving training efficiency in matrix factorization tasks.

Findings

01. Training acceleration of 3-37% observed.

02. Self-similar arrangements preserve data structure.

03. Method shows promise for enhancing SGD efficiency.

Abstract

Optimization of machine learning models is commonly performed through stochastic gradient updates on randomly ordered training examples. This practice means that sub-epochs consist of independent random samples of the training data that may not preserve informative structure present in the full data. We hypothesize that training can be more effective with self-similar arrangements that potentially allow each epoch to provide the benefits of multiple ones. We study this for "matrix factorization", the common task of learning metric embeddings of entities such as queries, videos, or words from example pairwise associations. We construct arrangements that preserve the weighted Jaccard similarities of rows and columns and experimentally observe training acceleration of 3%-37% on synthetic and recommendation datasets. Principled arrangements of training examples emerge as a novel…
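As a small illustration of the similarity measure named in the abstract, the sketch below computes the standard weighted Jaccard similarity of two nonnegative vectors, for example two rows of an association matrix: the sum of coordinate-wise minima divided by the sum of coordinate-wise maxima. The function name `weighted_jaccard` and the convention of returning 0.0 for two all-zero vectors are assumptions made for this example. It illustrates only the measure and is not the paper's arrangement construction.

```python
from typing import Sequence


def weighted_jaccard(u: Sequence[float], v: Sequence[float]) -> float:
    """Weighted Jaccard similarity of two equal-length nonnegative vectors.

    Defined as sum_i min(u_i, v_i) / sum_i max(u_i, v_i).
    Assumption for this sketch: two all-zero vectors have similarity 0.0.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    if any(x < 0 for x in u) or any(x < 0 for x in v):
        raise ValueError("weights must be nonnegative")

    numerator = sum(min(a, b) for a, b in zip(u, v))
    denominator = sum(max(a, b) for a, b in zip(u, v))
    return numerator / denominator if denominator > 0 else 0.0


# Two hypothetical rows of an association matrix (made-up values).
row_a = [1.0, 0.0, 2.0]
row_b = [1.0, 1.0, 1.0]
similarity = weighted_jaccard(row_a, row_b)
```

For the made-up rows above, the minima sum to 2 and the maxima sum to 4, so the similarity is 0.5. Identical nonzero rows give 1.0, and rows with disjoint support give 0.0.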

Taxonomy

Topics: Topic Modeling · Machine Learning and Data Classification · Text and Document Classification Technologies

Methods: Stochastic Gradient Descent