TL;DR
This paper investigates the computational power of finite-precision RNNs whose computation time is linear in the input length. It shows that LSTMs and ReLU-RNNs are strictly stronger than GRUs and squashing-activation RNNs in this setting, because they can implement counting.
Contribution
It characterizes the computational power of finite-precision, linear-time RNNs and shows that the LSTM and the Elman-RNN with ReLU activation are strictly more powerful than the GRU and the Elman-RNN with a squashing activation.
Findings
LSTMs and ReLU-RNNs can implement counting mechanisms
Empirical evidence shows LSTMs effectively learn counting behavior
Different RNN variants have varying computational strengths under finite precision
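The contrast between the findings above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the paper's argument: a single recurrent unit with a hand-chosen recurrent weight and bias of 1, evaluated in float32, once with ReLU and once with tanh. The helper name `run_unit` is hypothetical.

```python
import numpy as np

def run_unit(activation, steps):
    """Iterate h <- activation(h + 1) in float32 and return every state visited."""
    h = np.float32(0.0)
    states = []
    for _ in range(steps):
        h = np.float32(activation(h + np.float32(1.0)))
        states.append(float(h))
    return states

def relu(x):
    return np.maximum(x, np.float32(0.0))

relu_states = run_unit(relu, 1000)     # unbounded state: acts as a counter
tanh_states = run_unit(np.tanh, 1000)  # bounded state: settles near a fixed point

distinct_relu = len(set(relu_states))
distinct_tanh = len(set(tanh_states))
print("distinct ReLU states:", distinct_relu)
print("distinct tanh states:", distinct_tanh)
```

With these particular weights the ReLU unit visits a new state at every step, while the tanh unit stops producing new states after a handful of steps, so it cannot tell long counts apart. Other weight choices behave differently in detail; the sketch only shows why a bounded, finite-precision state is a poor counter.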
Abstract
While Recurrent Neural Networks (RNNs) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time. We consider the case of RNNs with finite precision whose computation time is linear in the input length. Under these limitations, we show that different RNN variants have different computational power. In particular, we show that the LSTM and the Elman-RNN with ReLU activation are strictly stronger than the RNN with a squashing activation and the GRU. This is achieved because LSTMs and ReLU-RNNs can easily implement counting behavior. We show empirically that the LSTM does indeed learn to effectively use the counting mechanism.
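As a rough illustration of the counting behavior the abstract refers to, the sketch below hand-sets the weights of a four-unit ReLU Elman-RNN so that it recognizes the language a^n b^n. The weights, the nonzero initial state `H0`, and the function name `accepts_anbn` are assumptions made for this example and are not necessarily the construction used in the paper.

```python
import numpy as np

# One-hot encodings of the two input symbols.
ONE_HOT = {
    "a": np.array([1.0, 0.0], dtype=np.float32),
    "b": np.array([0.0, 1.0], dtype=np.float32),
}

# Hidden state layout: [count_a, count_b, no_b_seen_yet, misplaced_a].
# Update rule: h_t = relu(W x_t + U h_{t-1}), with hand-chosen weights.
W = np.array([[1, 0],     # count_a grows on every 'a'
              [0, 1],     # count_b grows on every 'b'
              [0, -1],    # the flag drops to 0 at the first 'b'
              [1, 0]],    # misplaced_a can grow on an 'a' ...
             dtype=np.float32)
U = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, -1, 1]],  # ... but only once the flag is 0
             dtype=np.float32)
H0 = np.array([0, 0, 1, 0], dtype=np.float32)  # flag starts at 1

def accepts_anbn(s):
    """Return True if s is a^n b^n (n >= 0), using only the RNN state."""
    h = H0.copy()
    for ch in s:
        h = np.maximum(W @ ONE_HOT[ch] + U @ h, 0.0)
    count_a, count_b, _, misplaced_a = h
    return bool(count_a == count_b and misplaced_a == 0.0)

for s in ["aabb", "abab", "aab", "ba", "a" * 500 + "b" * 500]:
    print(len(s), accepts_anbn(s))
```

The two counter units grow without bound, which is exactly what a squashing activation would prevent. In float32 the counts stay exact only up to 2^24, so this sketch is itself subject to the finite-precision limits the paper studies.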
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Gated Recurrent Unit · Long Short-Term Memory
