Prototypical Recurrent Unit
Dingkun Long, Richong Zhang, Yongyi Mao

TL;DR
This paper introduces the prototypical recurrent unit (PRU), a simplified alternative to the LSTM and GRU recurrent units. The PRU makes recurrent networks easier to analyze while performing comparably to LSTM and GRU, and the paper uses it to study the memorization capabilities of such networks.
Contribution
The paper proposes the PRU, a minimalistic recurrent unit that captures key features of LSTM/GRU, facilitating theoretical analysis and understanding of recurrent neural networks.
Findings
PRU achieves performance comparable to LSTM and GRU.
A network's ability to memorize depends on the amount of information to be retained and on the size of its state space.
PRU provides a simpler model for studying long-term memory in recurrent networks.
Abstract
Despite the great successes of deep learning, the effectiveness of deep neural networks has not been understood at any theoretical depth. This work is motivated by the thrust of developing a deeper understanding of recurrent neural networks, particularly LSTM/GRU-like networks. As the highly complex structure of the recurrent unit in LSTM and GRU networks makes them difficult to analyze, our methodology in this research theme is to construct an alternative recurrent unit that is as simple as possible and yet also captures the key components of LSTM/GRU recurrent units. Such a unit can then be used for the study of recurrent networks and its structural simplicity may allow easier analysis. Towards that goal, we take a system-theoretic perspective to design a new recurrent unit, which we call the prototypical recurrent unit (PRU). Not only having minimal complexity, PRU is demonstrated…
Taxonomy
Topics: Neural Networks and Applications · Topic Modeling · Parallel Computing and Optimization Techniques
Methods: Sigmoid Activation · Tanh Activation · Gated Recurrent Unit · Long Short-Term Memory
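
For context on the gated units named under Methods, here is a minimal sketch of a single step of a standard gated recurrent unit (GRU), written in plain Python. This is not the PRU: this summary does not give the PRU equations, so none are reproduced here. The function names (`gru_step`, `matvec`, `sigmoid`) and the parameter dictionary layout are assumptions made for illustration, and the update-gate convention used (new state = (1 - z) * old state + z * candidate) is one of two common conventions.

```python
import math


def sigmoid(v):
    """Logistic sigmoid, used for the gates."""
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gru_step(x, h, p):
    """One step of a standard GRU (illustrative sketch, not the PRU).

    x: input vector, h: previous state vector.
    p: hypothetical dict of parameters with keys
       Wz, Uz, bz (update gate), Wr, Ur, br (reset gate),
       Wh, Uh, bh (candidate state).
    """
    def affine(W, U, b, h_in):
        # Computes W x + U h_in + b, elementwise over the output units.
        return [a + c + d for a, c, d in zip(matvec(W, x), matvec(U, h_in), b)]

    z = [sigmoid(v) for v in affine(p["Wz"], p["Uz"], p["bz"], h)]  # update gate
    r = [sigmoid(v) for v in affine(p["Wr"], p["Ur"], p["br"], h)]  # reset gate
    gated_h = [ri * hi for ri, hi in zip(r, h)]                     # reset applied to old state
    cand = [math.tanh(v) for v in affine(p["Wh"], p["Uh"], p["bh"], gated_h)]
    # Convex combination of the old state and the candidate state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
```

Even this compact form involves two sigmoid gates, a tanh candidate, and nine parameter blocks, which illustrates the structural complexity that the abstract cites as the motivation for a simpler unit.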
