Real-time Neural-based Input Method
Jiali Yao, Raphael Shu, Xinjian Li, Katsutoshi Ohtsuki, Hideki Nakayama

TL;DR
This paper introduces an efficient neural input method for Japanese that uses incremental selective softmax to achieve real-time performance on CPUs and significantly reduces model size.
Contribution
It proposes a novel incremental selective softmax technique to speed up neural input methods and demonstrates substantial model compression without accuracy loss.
Findings
Two orders of magnitude speedup in softmax computation
Real-time Japanese input conversion on commodity CPUs
92% model size reduction without accuracy loss
Abstract
The input method is an essential service on every mobile and desktop device that provides text suggestions. It converts sequential keyboard input into characters of the target language, which is indispensable for Japanese and Chinese users. Due to critical resource constraints and the limited network bandwidth of the target devices, applying neural models to input methods is not well explored. In this work, we apply an LSTM-based language model to the input method and evaluate its performance on both prediction and conversion tasks with the Japanese BCCWJ corpus. We identify the bottleneck as the slow softmax computation during conversion. To solve the issue, we propose an incremental softmax approximation approach, which computes softmax with a selected subset vocabulary and fixes the stale probabilities when the vocabulary is updated in future steps. We refer to this method as incremental…
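The idea of computing softmax over a selected subset vocabulary and correcting stale probabilities when that subset grows can be illustrated with a minimal sketch. This is not the authors' implementation: the class name IncrementalSelectiveSoftmax, its methods extend and prob, and the choice to keep a running normalizer are all assumptions made for illustration, and the sketch omits the usual max-subtraction trick for numerical stability.

```python
import math


class IncrementalSelectiveSoftmax:
    """Softmax restricted to a growing subset of the vocabulary.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self, output_weights, hidden):
        self.W = output_weights  # one weight row per vocabulary word
        self.h = hidden          # hidden state for one decoding step
        self.logits = {}         # word id -> logit, selected words only
        self.sum_exp = 0.0       # running sum of exp(logit) over the subset

    def extend(self, word_ids):
        """Add words to the subset, computing logits only for new ones."""
        for w in word_ids:
            if w in self.logits:
                continue  # already scored at an earlier step, reuse it
            z = sum(a * b for a, b in zip(self.W[w], self.h))
            self.logits[w] = z
            self.sum_exp += math.exp(z)  # update the normalizer in place

    def prob(self, w):
        """Probability of w under the current subset vocabulary."""
        return math.exp(self.logits[w]) / self.sum_exp


# Toy output layer: 4 words, hidden size 2 (values are made up).
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]
h = [0.5, -0.2]

step = IncrementalSelectiveSoftmax(W, h)
step.extend([0, 1])       # initial candidate words
p_before = step.prob(0)
step.extend([1, 2])       # later input adds word 2; word 1 is reused
p_after = step.prob(0)    # earlier value was stale; normalizer has grown
```

Because only the normalizer changes when the subset grows, previously computed logits are reused and a stale probability is corrected by dividing by the updated sum rather than by rescoring the whole vocabulary.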
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Softmax
