Now It Sounds Like You: Learning Personalized Vocabulary On Device
Sid Wang, Ashish Shenoy, Pierce Chuang, John Nguyen

TL;DR
This paper introduces a personalized federated learning approach with an 'OOV expansion' technique that enhances on-device language models' ability to handle out-of-vocabulary words, improving accuracy while respecting device constraints.
Contribution
It presents a novel 'OOV expansion' method with a personalized 'OOV adapter' for better OOV handling in federated NLP models, addressing memory and latency limitations.
Findings
OOV expansion outperforms standard personalization methods.
The approach improves OOV coverage and model accuracy.
The method has minimal impact on device memory and latency.
Abstract
In recent years, Federated Learning (FL) has shown significant advancements in its ability to perform various natural language processing (NLP) tasks. This work focuses on applying personalized FL for on-device language modeling. Due to limitations of memory and latency, these models cannot support the complexity of sub-word tokenization or beam search decoding, resulting in the decision to deploy a closed-vocabulary language model. However, closed-vocabulary models are unable to handle out-of-vocabulary (OOV) words belonging to specific users. To address this issue, we propose a novel technique called "OOV expansion" that improves OOV coverage and increases model accuracy while minimizing the impact on memory and latency. This method introduces a personalized "OOV adapter" that effectively transfers knowledge from a central model and learns word embedding for personalized vocabulary.…
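To make the OOV problem concrete, the following is a minimal Python sketch of how a closed vocabulary loses user-specific words and how adding a small per-user word list raises coverage. It is not the paper's OOV adapter: the function names, the toy vocabulary and sentence, and the frequency-based selection rule are all assumptions made for illustration, and no embeddings are learned here.

```python
from collections import Counter


def oov_rate(tokens, vocab):
    """Fraction of tokens that fall outside the given closed vocabulary."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in vocab) / len(tokens)


def select_personal_words(tokens, vocab, max_new_words):
    """Pick the user's most frequent OOV words, capped at max_new_words.

    The cap stands in for the on-device memory budget (an assumption).
    """
    counts = Counter(t for t in tokens if t not in vocab)
    return {word for word, _ in counts.most_common(max_new_words)}


# Hypothetical central vocabulary and one user's typed text.
central_vocab = {"see", "you", "at", "the", "tonight"}
user_tokens = "see you at zumba tonight zumba at the dojo".split()

personal_words = select_personal_words(user_tokens, central_vocab, max_new_words=1)
before = oov_rate(user_tokens, central_vocab)
after = oov_rate(user_tokens, central_vocab | personal_words)

print(f"OOV rate before: {before:.3f}, after: {after:.3f}")
```

With a budget of one extra word, the sketch keeps the user's most frequent OOV word and leaves the rarer one uncovered, which shows the trade-off between coverage and on-device memory that the abstract describes.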
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Speech Recognition and Synthesis
