TL;DR
This paper introduces a compression technique for text classification models, built on product quantization of word embeddings. The resulting models typically need about two orders of magnitude less memory than fastText at only slightly lower accuracy, giving a better memory-accuracy trade-off than prior methods.
Contribution
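It adapts product quantization to the storage of word embeddings, modifying the original technique to circumvent the quantization artefacts that would otherwise reduce accuracy. The resulting compact models stay close to fastText in accuracy while using far less memory than previous approaches.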
Findings
Typically requires about two orders of magnitude (roughly 100x) less memory than fastText
Accuracy is only slightly below that of fastText
Outperforms the state of the art in the trade-off between memory usage and accuracy
Abstract
We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory. After considering different solutions inspired by the hashing literature, we propose a method built upon product quantization to store word embeddings. While the original technique leads to a loss in accuracy, we adapt this method to circumvent quantization artefacts. Our experiments carried out on several benchmarks show that our approach typically requires two orders of magnitude less memory than fastText while being only slightly inferior with respect to accuracy. As a result, it outperforms the state of the art by a good margin in terms of the compromise between memory usage and accuracy.
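As a rough illustration of the building block the abstract refers to, the following is a minimal sketch of plain product quantization applied to an embedding matrix, using only NumPy. It is not the paper's adapted method: the changes the authors make to avoid quantization artefacts are not reproduced here. The function names (`kmeans`, `pq_train_encode`, `pq_decode`), the toy matrix sizes, and the choice of 16 centroids per sub-quantizer are all assumptions made for the example.

```python
import numpy as np

def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d.argmin(axis=1)

def pq_train_encode(emb, n_sub, k=256):
    """Split each vector into n_sub blocks and quantize each block separately."""
    n, dim = emb.shape
    assert dim % n_sub == 0, "dimension must be divisible by n_sub"
    assert k <= 256, "codes are stored as one byte per sub-vector"
    sub_dim = dim // n_sub
    codebooks = np.empty((n_sub, k, sub_dim), dtype=emb.dtype)
    codes = np.empty((n, n_sub), dtype=np.uint8)
    for s in range(n_sub):
        block = emb[:, s * sub_dim:(s + 1) * sub_dim]
        codebooks[s], codes[:, s] = kmeans(block, k, seed=s)
    return codebooks, codes

def pq_decode(codebooks, codes):
    """Rebuild approximate vectors by concatenating the selected centroids."""
    n_sub = codebooks.shape[0]
    return np.concatenate(
        [codebooks[s][codes[:, s]] for s in range(n_sub)], axis=1
    )

# Toy data standing in for a word-embedding matrix (hypothetical sizes).
rng = np.random.default_rng(0)
emb = rng.standard_normal((2000, 16)).astype(np.float32)

codebooks, codes = pq_train_encode(emb, n_sub=4, k=16)
approx = pq_decode(codebooks, codes)

# Memory of the float matrix versus byte codes plus codebooks.
ratio = emb.nbytes / (codes.nbytes + codebooks.nbytes)
mse = float(np.mean((emb - approx) ** 2))
print(f"compression ratio: {ratio:.1f}x, reconstruction MSE: {mse:.3f}")
```

Each embedding is stored as `n_sub` one-byte codes instead of `dim` floats, so the saving grows with the embedding dimension and the vocabulary size, while the codebooks are a fixed overhead. The ratio on this toy matrix is far smaller than the figures reported in the abstract, which come from the authors' full method on real models.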
Code & Models
- 🤗 julien-c/fasttext-language-id (model) · 2.7k dl · ♡ 4
- 🤗 facebook/fasttext-language-identification (model) · 305k dl · ♡ 258
- 🤗 facebook/fasttext-en-vectors (model) · 451 dl · ♡ 18
- 🤗 facebook/fasttext-ko-vectors (model) · 19 dl · ♡ 10
- 🤗 facebook/fasttext-af-vectors (model) · 2 dl
- 🤗 facebook/fasttext-sq-vectors (model) · 9 dl · ♡ 1
- 🤗 facebook/fasttext-als-vectors (model) · 2 dl
- 🤗 facebook/fasttext-am-vectors (model) · 2 dl
- 🤗 facebook/fasttext-ar-vectors (model) · 9 dl · ♡ 6
- 🤗 facebook/fasttext-an-vectors (model) · 3 dl
Taxonomy
Methods: fastText
