Thick-Net: Parallel Network Structure for Sequential Modeling

Yu-Xuan Li; Jin-Yuan Liu; Liang Li; Xiang Guan

arXiv:1911.08074·cs.LG·November 20, 2019

TL;DR

Thick-Net adds several parallel parameter sets to each hidden state of a recurrent network and keeps the maximum of their outputs. The authors report better accuracy, faster convergence, and better generalization, with less overfitting and easier optimization.

Contribution

The paper proposes Thick-Net, a parallel recurrent architecture that expands the network along a dimension other than width or depth, 'thickness', to improve sequence learning performance.

Findings

01. Achieves better accuracy and faster convergence.

02. Demonstrates improved generalization across tasks.

03. Effectively reduces overfitting and simplifies optimization.

Abstract

Recurrent neural networks have been widely used in sequence learning tasks. In previous studies, model performance has been improved by using either wider or deeper structures. However, the former are more prone to overfitting, while the latter are difficult to optimize. In this paper, we propose a simple new model named Thick-Net, which expands the network along another dimension: thickness. Multiple parallel values are obtained via more sets of parameters in each hidden state, and the maximum value among these parallel intermediate outputs is selected as the final output. Notably, Thick-Net can efficiently avoid overfitting, and is easier to optimize than the vanilla structures due to the large dropout affiliated with it. Our model is evaluated on four sequential tasks: the adding problem, permuted sequential MNIST, text classification, and language modeling. The results of…
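The mechanism described in the abstract (several parameter sets per hidden state, with the maximum kept) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a vanilla tanh recurrent cell, an elementwise maximum over the parallel candidates, and hypothetical names (`thick_rnn_step`, `run_thick_rnn`, `K` for the number of parallel parameter sets). It leaves out dropout and training entirely.

```python
import numpy as np

def thick_rnn_step(x_t, h_prev, W, U, b):
    """One step of a tanh RNN cell with K parallel parameter sets (assumed form).

    W: (K, hidden, n_in) input weights, one matrix per parallel set
    U: (K, hidden, hidden) recurrent weights, one matrix per parallel set
    b: (K, hidden) biases, one vector per parallel set
    """
    # Each of the K parameter sets yields its own candidate hidden state.
    candidates = np.tanh(W @ x_t + U @ h_prev + b)  # shape (K, hidden)
    # Keep the elementwise maximum across the K candidates.
    return candidates.max(axis=0)                   # shape (hidden,)

def run_thick_rnn(xs, W, U, b):
    """Run the cell over a sequence xs of shape (T, n_in); return the last state."""
    h = np.zeros(b.shape[1])
    for x_t in xs:
        h = thick_rnn_step(x_t, h, W, U, b)
    return h

# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
K, hidden, n_in, T = 3, 4, 2, 5
W = rng.normal(scale=0.5, size=(K, hidden, n_in))
U = rng.normal(scale=0.5, size=(K, hidden, hidden))
b = np.zeros((K, hidden))
xs = rng.normal(size=(T, n_in))

h_final = run_thick_rnn(xs, W, U, b)
print(h_final.shape)
```

Because tanh is monotonic, taking the maximum before or after the nonlinearity gives the same result in this sketch. With `K = 1` the cell reduces to an ordinary tanh RNN step.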

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Neural Networks and Applications

Methods: Dropout