Neural Speed Reading via Skim-RNN
Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi

TL;DR
Skim-RNN is a recurrent neural network that dynamically decides to update only a small fraction of its hidden state for relatively unimportant tokens, significantly reducing computational cost while maintaining accuracy across five natural language tasks.
Contribution
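We introduce Skim-RNN, an RNN variant that updates only a small fraction of the hidden state for unimportant inputs instead of the entire state. It keeps the same input and output interfaces as a standard RNN, so it can replace RNNs in existing models and reduce inference cost without sacrificing accuracy.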
Findings
Achieves significantly lower computational cost than standard RNNs.
Maintains accuracy across five natural language tasks.
Allows the speed-accuracy trade-off to be controlled dynamically, and stably, at inference time.
Abstract
Inspired by the principles of speed reading, we introduce Skim-RNN, a recurrent neural network (RNN) that dynamically decides to update only a small fraction of the hidden state for relatively unimportant input tokens. Skim-RNN gives computational advantage over an RNN that always updates the entire hidden state. Skim-RNN uses the same input and output interfaces as a standard RNN and can be easily used instead of RNNs in existing models. In our experiments, we show that Skim-RNN can achieve significantly reduced computational cost without losing accuracy compared to standard RNNs across five different natural language tasks. In addition, we demonstrate that the trade-off between accuracy and speed of Skim-RNN can be dynamically controlled during inference time in a stable manner. Our analysis also shows that Skim-RNN running on a single CPU offers lower latency compared to standard…
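The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a plain tanh RNN cell rather than whatever cell the paper uses, and the names `make_params`, `skim_rnn_forward`, `d_small`, and `skim_threshold` are hypothetical. The decision layer (a single linear map followed by a sigmoid) and the use of a threshold to tune how often the model skims are also assumptions made for illustration. How such a discrete decision is trained is not covered here.

```python
import numpy as np


def make_params(e, d, d_small, seed=0):
    """Create random toy weights (hypothetical layout, for illustration only)."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return (
        rng.normal(scale=scale, size=(d, e)),        # W_big: input -> full state
        rng.normal(scale=scale, size=(d, d)),        # U_big: state -> full state
        rng.normal(scale=scale, size=(d_small, e)),  # W_small: input -> small slice
        rng.normal(scale=scale, size=(d_small, d)),  # U_small: state -> small slice
        rng.normal(scale=scale, size=(e + d,)),      # W_dec: skim decision weights
    )


def skim_rnn_forward(xs, params, d_small, skim_threshold=0.5):
    """Run a toy skim-style RNN; return the final state and per-token skim flags."""
    W_big, U_big, W_small, U_small, W_dec = params
    d = U_big.shape[0]
    h = np.zeros(d)
    skimmed = []
    for x in xs:
        # Assumed decision rule: skim when the sigmoid score exceeds the threshold.
        logit = W_dec @ np.concatenate([x, h])
        p_skim = 1.0 / (1.0 + np.exp(-logit))
        if p_skim > skim_threshold:
            # Skim: update only the first d_small units, copy the rest unchanged.
            h_new = h.copy()
            h_new[:d_small] = np.tanh(W_small @ x + U_small @ h)
            skimmed.append(True)
        else:
            # Full read: update the entire hidden state.
            h_new = np.tanh(W_big @ x + U_big @ h)
            skimmed.append(False)
        h = h_new
    return h, skimmed


if __name__ == "__main__":
    params = make_params(e=4, d=6, d_small=2)
    tokens = np.random.default_rng(1).normal(size=(5, 4))
    state, flags = skim_rnn_forward(tokens, params, d_small=2)
    print("skimmed tokens:", sum(flags), "of", len(flags))
```

In this sketch a skimmed step costs roughly `d_small * (e + d)` multiply-adds for the state update versus `d * (e + d)` for a full step, which is where the saving would come from. Raising `skim_threshold` makes the toy model skim less often (more computation), and lowering it makes it skim more often, which is one plausible way to expose a speed-accuracy knob at inference time.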
Taxonomy
Topics: Topic Modeling · Advanced Neural Network Applications · Ferroelectric and Negative Capacitance Devices
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
