Deep Networks With Large Output Spaces

Sudheendra Vijayanarasimhan; Jonathon Shlens; Rajat Monga and; Jay Yagnik

arXiv:1412.7479·cs.NE·April 13, 2015·ICLR·31 cites

Deep Networks With Large Output Spaces

Sudheendra Vijayanarasimhan, Jonathon Shlens, Rajat Monga and, Jay Yagnik

PDF

Open Access

TL;DR

This paper introduces a locality-sensitive hashing method to efficiently approximate dot products in deep neural networks, enabling scalable training and inference for models with millions of output classes.

Contribution

It presents a novel hashing technique that significantly accelerates training and inference in large output space neural networks, addressing computational challenges.

Findings

01

Faster training compared to baseline methods

02

Effective approximation of dot products in large-scale models

03

Successful application to diverse recognition tasks

Abstract

Deep neural networks have been extremely successful at various image, speech, video recognition tasks because of their ability to model deep structures within the data. However, they are still prohibitively expensive to train and apply for problems containing millions of classes in the output layer. Based on the observation that the key computation common to most neural network layers is a vector/matrix product, we propose a fast locality-sensitive hashing technique to approximate the actual dot product enabling us to scale up the training and inference to millions of output classes. We evaluate our technique on three diverse large-scale recognition tasks and show that our approach can train large-scale models at a faster rate (in terms of steps/total time) compared to baseline methods.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Multimodal Machine Learning Applications