Architectural Complexity Measures of Recurrent Neural Networks

Saizheng Zhang; Yuhuai Wu; Tong Che; Zhouhan Lin; Roland Memisevic; Ruslan Salakhutdinov; Yoshua Bengio

arXiv:1602.08210 · cs.LG · November 15, 2016 · 111 cites

TL;DR

This paper introduces a graph-theoretic framework and three novel complexity measures for recurrent neural networks, providing insights into their architecture and how it affects performance on long-term dependency tasks.

Contribution

It presents a rigorous framework for analyzing RNN architectures and proposes three new complexity measures, with empirical evidence of their impact on performance.

Findings

01 Larger recurrent depth and feedforward depth can improve RNN performance.

02 Increasing the recurrent skip coefficient improves performance on long-term dependency tasks.

03 Each measure is proved to exist and to be computable.

Abstract

In this paper, we systematically analyze the connecting architectures of recurrent neural networks (RNNs). Our main contribution is twofold: first, we present a rigorous graph-theoretic framework describing the connecting architectures of RNNs in general. Second, we propose three architecture complexity measures of RNNs: (a) the recurrent depth, which captures the RNN's over-time nonlinear complexity, (b) the feedforward depth, which captures the local input-output nonlinearity (similar to the "depth" in feedforward neural networks (FNNs)), and (c) the recurrent skip coefficient which captures how rapidly the information propagates over time. We rigorously prove each measure's existence and computability. Our experimental results show that RNNs might benefit from larger recurrent depth and feedforward depth. We further demonstrate that increasing recurrent skip coefficient offers…
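To make two of the three measures concrete, here is a minimal Python sketch. It rests on assumptions that the abstract above does not state: the RNN is described by a small directed cyclic graph whose edges carry integer time delays, recurrent depth is taken as the maximum over directed cycles of (number of edges) / (total delay), and the recurrent skip coefficient is taken as the reciprocal of the minimum of that same ratio. The edge-list format, the node names `h1` and `h2`, and all function names are hypothetical, and the brute-force cycle enumeration is only meant for toy graphs. Feedforward depth is not covered.

```python
from fractions import Fraction


def simple_cycles(edges):
    """Yield each simple directed cycle as a list of (u, v, delay) edges.

    Each cycle is reported once, rooted at its smallest node label.
    Brute-force DFS: fine for toy graphs, not for large ones.
    """
    adj = {}
    for u, v, d in edges:
        adj.setdefault(u, []).append((u, v, d))
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})

    def dfs(start, node, path, visited):
        for edge in adj.get(node, []):
            nxt = edge[1]
            if nxt == start:
                yield path + [edge]
            elif nxt not in visited and nxt > start:
                yield from dfs(start, nxt, path + [edge], visited | {nxt})

    for s in nodes:
        yield from dfs(s, s, [], {s})


def cycle_ratios(edges):
    """Return (cycle length) / (total time delay) for every simple cycle."""
    ratios = []
    for cycle in simple_cycles(edges):
        delay = sum(d for _, _, d in cycle)
        if delay <= 0:
            # Assumption: every cycle must advance time for the RNN to be well defined.
            raise ValueError("cycle with non-positive total delay")
        ratios.append(Fraction(len(cycle), delay))
    return ratios


def recurrent_depth(edges):
    """Assumed definition: maximum cycle length per unit of time delay."""
    return max(cycle_ratios(edges))


def recurrent_skip_coefficient(edges):
    """Assumed definition: reciprocal of the minimum cycle length per unit delay."""
    return 1 / min(cycle_ratios(edges))


# Hypothetical two-layer stacked RNN: each layer feeds itself one step later,
# and h1 feeds h2 within the same time step (delay 0).
stacked = [("h1", "h1", 1), ("h1", "h2", 0), ("h2", "h2", 1)]

# Variant with a top-down connection from h2 back to h1 at the next time step.
top_down = stacked + [("h2", "h1", 1)]

# Variant with a skip connection on h2 that jumps two time steps.
skip = stacked + [("h2", "h2", 2)]
```

Under these assumptions, the top-down variant has a higher recurrent depth than the plain stacked network, because the cycle h1 → h2 → h1 applies two transformations per time step. The skip variant has a higher recurrent skip coefficient, because one edge spans two time steps.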

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Neural Networks and Applications · Ferroelectric and Negative Capacitance Devices