Handling Compounding in Mobile Keyboard Input
Andreas Kabel, Keith Hall, Tom Ouyang, David Rybach, Daan van Esch, Françoise Beaufays

TL;DR
This paper introduces a framework that models morphologically rich languages with subword units annotated with binding types, improving mobile keyboard input and reducing word error rate by approximately 20%.
Contribution
The paper presents an approach that handles compounding in morphologically rich languages using automatically selected subword units annotated with binding types, achieving substantial word error rate reductions.
Findings
Approximately 20% word error rate reduction across languages
More than twice the improvement achieved by earlier, more basic methods
Effective handling of compounding in morphologically rich languages
Abstract
This paper proposes a framework to improve the typing experience of mobile users in morphologically rich languages. Smartphone keyboards typically support features such as input decoding, corrections and predictions that all rely on language models. For latency reasons, these operations happen on device, so the models are of limited size and cannot easily cover all the words needed by users for their daily tasks, especially in morphologically rich languages. In particular, the compounding nature of Germanic languages makes their vocabulary virtually infinite. Similarly, heavily inflecting and agglutinative languages (e.g. Slavic, Turkic or Finno-Ugric languages) tend to have much larger vocabularies than morphologically simpler languages, such as English or Mandarin. We propose to model such languages with automatically selected subword units annotated with what we call binding types,…
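As a rough illustration of the idea of subword units that carry binding information, the sketch below joins a sequence of units into words. The abstract excerpt above does not define the binding types, so the `Unit` class, the `binds_left` and `binds_right` flags, and the `join_units` function are all hypothetical names chosen for this example and should not be read as the paper's actual scheme.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Unit:
    """A subword unit with hypothetical binding flags (assumed for illustration)."""
    text: str
    binds_left: bool = False   # assumption: unit attaches to the preceding unit
    binds_right: bool = False  # assumption: unit attaches to the following unit


def join_units(units: Iterable[Unit]) -> str:
    """Join units into text, omitting the space wherever a boundary binds.

    A boundary is treated as binding if the left unit binds right
    or the right unit binds left.
    """
    words = []
    prev = None
    for unit in units:
        if prev is not None and (prev.binds_right or unit.binds_left):
            words[-1] += unit.text   # glue onto the current word
        else:
            words.append(unit.text)  # start a new word
        prev = unit
    return " ".join(words)


# German compound "Haustür" (front door) built from two units.
sentence = [
    Unit("die"),
    Unit("Haus", binds_right=True),
    Unit("tür", binds_left=True),
]
print(join_units(sentence))
```

Under these assumptions, a small unit inventory can produce compounds that never appear as whole entries in the vocabulary, which is the motivation the abstract gives for subword modeling on size-constrained on-device models.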
Taxonomy
Topics: Multimedia Communication and Technology · Speech and dialogue systems · Usability and User Interface Design
