Small Character Models Match Large Word Models for Autocomplete Under Memory Constraints

Ganesh Jawahar; Subhabrata Mukherjee; Debadeepta Dey; Muhammad Abdul-Mageed; Laks V.S. Lakshmanan; Caio Cesar Teodoro Mendes; Gustavo Henrique de Rosa; Shital Shah

arXiv:2210.03251 · cs.CL · June 9, 2023

TL;DR

This paper demonstrates that small character-based language models can match larger word-based models in autocomplete accuracy under memory constraints, especially in open-domain scenarios, by adding compositional bias and transfer learning to the character models.

Contribution

The study shows character models can rival word models in accuracy for autocomplete tasks under limited memory, introducing methods to enhance character models with compositional bias and transfer learning.

Findings

01. Small character models match the autocomplete accuracy of larger word models.

02. Character models are more memory-efficient, which suits edge devices.

03. Compositional bias and transfer learning improve character model performance.

Abstract

Autocomplete is a task where the user inputs a piece of text, termed the prompt, on which the model conditions to generate a semantically coherent continuation. Existing work on this task has primarily focused on datasets (e.g., email, chat) with high-frequency user prompt patterns (or focused prompts), where word-based language models have been quite effective. In this work, we study the more challenging open-domain setting consisting of low-frequency user prompt patterns (or broad prompts, e.g., a prompt about the 93rd Academy Awards) and demonstrate the effectiveness of character-based language models. We study this problem under memory-constrained settings (e.g., edge devices and smartphones), where character-based representation is effective in reducing the overall model size (in terms of parameters). We use the WikiText-103 benchmark to simulate broad prompts and demonstrate that character…
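The abstract's point about parameter count can be illustrated with simple arithmetic: an embedding table holds one vector per vocabulary entry, so its size grows linearly with the vocabulary. The sketch below is illustrative only. The vocabulary sizes, the embedding width, and the helper name `embedding_params` are assumptions made for this example and are not taken from the paper, and it ignores techniques such as adaptive or tied embeddings that real models may use.

```python
def embedding_params(vocab_size: int, embed_dim: int) -> int:
    """Parameters in one untied embedding table: one vector per vocabulary entry."""
    return vocab_size * embed_dim


# Illustrative values only; none of these numbers come from the paper.
WORD_VOCAB = 250_000  # assumed order of magnitude for an open-domain word vocabulary
CHAR_VOCAB = 256      # assumed: one entry per byte value
EMBED_DIM = 512       # assumed embedding width

word_params = embedding_params(WORD_VOCAB, EMBED_DIM)
char_params = embedding_params(CHAR_VOCAB, EMBED_DIM)

print(f"word embedding table: {word_params:,} parameters")
print(f"char embedding table: {char_params:,} parameters")
print(f"ratio: {word_params // char_params}x")
```

Under these assumed numbers the word-level table dominates the parameter budget, while the character-level table is negligible, leaving almost all of a fixed memory budget for the rest of the network.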

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification