Fast Nearest-Neighbor Classification using RNN in Domains with Large Number of Classes
Gautam Singh, Gargi Dasgupta, Yu Deng

TL;DR
This paper introduces a hybrid cascaded approach combining a fast RNN and a slow nearest-neighbor method to efficiently classify text into thousands of classes with limited training data, improving speed and accuracy.
Contribution
The paper proposes a novel cascaded system that significantly reduces query time while enhancing accuracy in large-scale text classification tasks with few training samples per class.
Findings
Query time of the slow nearest-neighbor system is reduced to one-sixth of the original.
Outperforms an LSH-based baseline in speed.
Provides a lower bound on the accuracy of the cascaded model.
Abstract
In text classification scenarios where the number of classes is large (in the tens of thousands) and the training samples for each class are few and often verbose, nearest-neighbor methods are effective but very slow, because they compute a similarity score against the training samples of every class. Machine learning models, on the other hand, are fast at runtime, but they cannot be trained adequately from the few training samples available per class. In this paper, we propose a hybrid approach that cascades 1) a fast but less accurate recurrent neural network (RNN) model and 2) a slow but more accurate nearest-neighbor model using a bag of syntactic features. Using the cascaded approach, our experiments, performed on a data set from IT support services where customer complaint text needs to be classified to return the top-k possible error codes, show that the query time of the slow system is…
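The cascade described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the summary does not confirm: that the fast stage produces a shortlist of candidate classes and the slow stage compares the query only against training samples of those classes. The names cascade_classify, shortlist_size, top_k, and stub_fast_scores are hypothetical, a keyword stub stands in for the RNN, and Jaccard overlap on lowercase word sets stands in for the paper's bag of syntactic features.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokens (a stand-in for syntactic features)."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cascade_classify(
    query: str,
    train: List[Tuple[str, str]],
    fast_scores: Callable[[str], Dict[str, float]],
    shortlist_size: int,
    top_k: int,
) -> List[str]:
    # Stage 1 (fast, less accurate): keep only the highest-scoring classes.
    scores = fast_scores(query)
    ranked = sorted(scores, key=scores.get, reverse=True)
    shortlist = set(ranked[:shortlist_size])

    # Stage 2 (slow, more accurate): nearest-neighbor scoring, restricted
    # to training samples whose class survived stage 1.
    q = tokens(query)
    best: Dict[str, float] = defaultdict(float)
    for text, label in train:
        if label in shortlist:
            best[label] = max(best[label], jaccard(q, tokens(text)))
    return sorted(best, key=best.get, reverse=True)[:top_k]


# Toy data; the error codes are invented for illustration only.
train = [
    ("disk drive failure on server", "E100"),
    ("server disk not detected", "E100"),
    ("network adapter link down", "E200"),
    ("printer out of paper", "E300"),
]


def stub_fast_scores(query: str) -> Dict[str, float]:
    """Hypothetical stand-in for the RNN: one keyword per class."""
    q = tokens(query)
    return {
        "E100": float("disk" in q),
        "E200": float("network" in q),
        "E300": float("printer" in q),
    }


result = cascade_classify(
    "disk failure reported on the server", train, stub_fast_scores,
    shortlist_size=2, top_k=1,
)
print(result)
```

In this sketch the slow stage never touches samples of classes outside the shortlist, which is where a speedup would come from under these assumptions. The trade-off is that a class dropped by the fast stage can no longer be recovered by the slow one.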
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Text and Document Classification Technologies · Machine Learning and Data Classification
