Efficient Nearest Neighbor Language Models
Junxian He, Graham Neubig, Taylor Berg-Kirkpatrick

TL;DR
This paper enhances the efficiency of non-parametric neural language models by introducing methods that significantly speed up inference while maintaining performance, facilitating their practical deployment.
Contribution
The paper proposes methods to improve the inference efficiency of k-nearest neighbors language models, achieving up to 6x speed-up without performance loss.
Findings
Up to 6x faster inference speed on WikiText-103.
Maintained comparable language modeling performance.
Guidelines for developing efficient non-parametric NLMs.
Abstract
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore, which allows them to learn through explicitly memorizing the training datapoints. While effective, these models often require retrieval from a large datastore at test time, significantly increasing the inference overhead and thus limiting the deployment of non-parametric NLMs in practical applications. In this paper, we take the recently proposed -nearest neighbors language model (Khandelwal et al., 2020) as an example, exploring methods to improve its efficiency along various dimensions. Experiments on the standard WikiText-103 benchmark and domain-adaptation datasets show that our methods are able to achieve up to a 6x speed-up in inference speed while retaining comparable performance. The empirical analysis we present may provide guidelines for future research…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
