From Nodes to Networks: Evolving Recurrent Neural Networks
Aditya Rawal, Risto Miikkulainen

TL;DR
This paper introduces an evolutionary method for designing complex recurrent neural network nodes; the node structures it discovers improve language modeling performance.
Contribution
It proposes a tree-based encoding for evolving recurrent nodes, enabling more effective exploration of new architectures than previous methods.
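To make the idea of a tree-based node encoding concrete, here is a minimal sketch rather than the authors' implementation. It assumes a node can be written as an expression tree whose leaves are the current input and the previous hidden and memory states, and whose internal vertices are elementwise operations. The operator sets, the leaf names (x, h_prev, c_prev), and the evaluate and mutate helpers are all hypothetical; values are scalars and learned weights are left out for brevity.

```python
import math
import random

# Hypothetical operator sets; the paper's actual function set may differ.
UNARY = {"tanh": math.tanh, "sigmoid": lambda v: 1.0 / (1.0 + math.exp(-v))}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
# Hypothetical leaf names: current input, previous hidden state, previous memory cell.
LEAVES = ("x", "h_prev", "c_prev")


def evaluate(tree, inputs):
    """Recursively evaluate a tree given a dict mapping leaf names to scalars."""
    if isinstance(tree, str):  # a leaf such as "x"
        return inputs[tree]
    op, *children = tree
    args = [evaluate(child, inputs) for child in children]
    if op in UNARY:
        return UNARY[op](args[0])
    return BINARY[op](args[0], args[1])


def mutate(tree, rng, rate=0.2):
    """Return a copy of the tree in which each leaf or operator is, with
    probability `rate`, redrawn from the set of the same arity (the redraw
    may pick the same symbol again). The tree shape is preserved."""
    if isinstance(tree, str):
        return rng.choice(LEAVES) if rng.random() < rate else tree
    op, *children = tree
    if rng.random() < rate:
        pool = UNARY if op in UNARY else BINARY
        op = rng.choice(sorted(pool))
    return (op, *[mutate(child, rng, rate) for child in children])


# A toy gated expression: sigmoid(x) * tanh(h_prev + c_prev).
node = ("mul", ("sigmoid", "x"), ("tanh", ("add", "h_prev", "c_prev")))
value = evaluate(node, {"x": 0.0, "h_prev": 0.5, "c_prev": 0.5})
variant = mutate(node, random.Random(0))
```

Because the mutation only swaps symbols of equal arity, every variant remains a valid tree that evaluate can run. A fuller evolutionary search would also need operators that change tree structure, such as inserting or deleting subtrees, plus crossover and a fitness measure; none of those are shown here.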
Findings
Discovered nodes with multiple recurrent paths and memory cells.
Achieved significant improvement on the standard language modeling benchmark task.
Sped up the search process with performance estimation and exploration strategies.
Abstract
Gated recurrent networks such as those composed of Long Short-Term Memory (LSTM) nodes have recently been used to improve state of the art in many sequential processing tasks such as speech recognition and machine translation. However, the basic structure of the LSTM node is essentially the same as when it was first conceived 25 years ago. Recently, evolutionary and reinforcement learning mechanisms have been employed to create new variations of this structure. This paper proposes a new method, evolution of a tree-based encoding of the gated memory nodes, and shows that it makes it possible to explore new variations more effectively than other methods. The method discovers nodes with multiple recurrent paths and multiple memory cells, which lead to significant improvement in the standard language modeling benchmark task. The paper also shows how the search process can be speeded up by…
Taxonomy
Topics: Neural Networks and Applications · Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
