Architectural Complexity Measures of Recurrent Neural Networks
Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan Salakhutdinov, Yoshua Bengio

TL;DR
This paper introduces a graph-theoretic framework for describing recurrent neural network architectures and three complexity measures (recurrent depth, feedforward depth, and recurrent skip coefficient), and examines how these measures affect performance, including on long-term dependency tasks.
Contribution
It presents a rigorous framework for analyzing RNN architectures and proposes three new complexity measures, with empirical evidence of their impact on performance.
Findings
Larger recurrent and feedforward depths can improve RNN performance.
Increasing the recurrent skip coefficient improves performance on long-term dependency tasks.
Each measure is proven to exist and to be computable.
Abstract
In this paper, we systematically analyze the connecting architectures of recurrent neural networks (RNNs). Our main contribution is twofold: first, we present a rigorous graph-theoretic framework describing the connecting architectures of RNNs in general. Second, we propose three architecture complexity measures of RNNs: (a) the recurrent depth, which captures the RNN's over-time nonlinear complexity, (b) the feedforward depth, which captures the local input-output nonlinearity (similar to the "depth" in feedforward neural networks (FNNs)), and (c) the recurrent skip coefficient which captures how rapidly the information propagates over time. We rigorously prove each measure's existence and computability. Our experimental results show that RNNs might benefit from larger recurrent depth and feedforward depth. We further demonstrate that increasing recurrent skip coefficient offers…
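To make the intuition behind measure (c) concrete, the following is a minimal sketch, not the paper's formal definition (which this summary does not reproduce). It assumes that "how rapidly the information propagates over time" can be illustrated by counting the fewest recurrent edges needed to carry information across a given number of time steps in an unrolled network. The function name `min_hops` and the `delays` parameter are hypothetical names chosen for this illustration.

```python
from typing import Iterable


def min_hops(n_steps: int, delays: Iterable[int]) -> int:
    """Fewest recurrent edges needed to carry information across n_steps time steps.

    `delays` lists the time delays of the available recurrent connections
    (1 = ordinary step-to-step edge, k > 1 = skip connection spanning k steps).
    This is an illustrative proxy, not the paper's definition of the
    recurrent skip coefficient.
    """
    delays = sorted(set(delays))
    if not delays or delays[0] < 1:
        raise ValueError("delays must be positive integers")
    inf = float("inf")
    # best[t] = fewest edges needed to span exactly t time steps
    best = [0] + [inf] * n_steps
    for t in range(1, n_steps + 1):
        for d in delays:
            if d <= t and best[t - d] + 1 < best[t]:
                best[t] = best[t - d] + 1
    if best[n_steps] == inf:
        raise ValueError("n_steps is not reachable with the given delays")
    return best[n_steps]


if __name__ == "__main__":
    n = 12
    plain = min_hops(n, [1])     # plain RNN: only step-to-step edges
    skip = min_hops(n, [1, 4])   # same RNN plus a skip connection of length 4
    # Time steps covered per edge: larger means faster propagation over time.
    print(plain, n / plain)
    print(skip, n / skip)
```

Under these assumptions, adding a skip connection of length 4 cuts the number of edges needed to span 12 time steps from 12 to 3, which is the sense in which skip connections shorten the path that long-term information has to travel.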
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Neural Networks and Applications · Ferroelectric and Negative Capacitance Devices
