Shortcut Sequence Tagging
Huijia Wu, Jiajun Zhang, Chengqing Zong

TL;DR
This paper introduces shortcut blocks combining gating and shortcuts to simplify deep RNNs, making training easier and improving performance on sequence tagging tasks like CCGbank supertagging.
Contribution
The paper proposes a novel shortcut block architecture that simplifies stacked RNNs and enhances training efficiency and accuracy.
Findings
Achieved a 6% relative improvement over the previous state of the art on CCGbank supertagging.
Demonstrated easier training and better generalization.
Obtained comparable results on POS tagging.
Abstract
Deep stacked RNNs are usually hard to train. Adding shortcut connections across different layers is a common way to ease the training of stacked networks. However, extra shortcuts make the recurrent step more complicated. To simplify the stacked architecture, we propose a framework called the shortcut block, which is a marriage of the gating mechanism and shortcuts, while discarding the self-connected part of the LSTM cell. We present extensive empirical experiments showing that this design makes training easy and improves generalization. We propose various shortcut block topologies and compositions to explore its effectiveness. Based on this architecture, we obtain a 6% relative improvement over the state of the art on the CCGbank supertagging dataset. We also get comparable results on the POS tagging task.
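The abstract describes the shortcut block only in words: gates plus a shortcut from another layer, with the LSTM cell's self-connection removed. The sketch below is one possible reading of that description, not the paper's equations. The class name `ShortcutBlockSketch`, the choice of gates (`i`, `s`, `o`), the candidate `g`, and the way the shortcut is mixed in are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used for all gates."""
    return 1.0 / (1.0 + np.exp(-x))


class ShortcutBlockSketch:
    """Hypothetical gated-shortcut recurrent layer (an assumed form, not the paper's)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        concat = input_size + hidden_size
        scale = 1.0 / np.sqrt(concat)
        # One weight matrix and bias each for: input gate (i), shortcut gate (s),
        # output gate (o), and tanh candidate (g). Names are illustrative only.
        self.W = {k: rng.normal(0.0, scale, (hidden_size, concat)) for k in "isog"}
        self.b = {k: np.zeros(hidden_size) for k in "isog"}
        self.hidden_size = hidden_size

    def step(self, x_t, h_prev, shortcut_t):
        """One time step; shortcut_t is a lower layer's output at the same step."""
        z = np.concatenate([x_t, h_prev])
        i = sigmoid(self.W["i"] @ z + self.b["i"])
        s = sigmoid(self.W["s"] @ z + self.b["s"])
        o = sigmoid(self.W["o"] @ z + self.b["o"])
        g = np.tanh(self.W["g"] @ z + self.b["g"])
        # Unlike a standard LSTM there is no forget-gate * previous-cell term:
        # the gated shortcut takes the place of the cell's self-connection.
        m = i * g + s * shortcut_t
        return o * np.tanh(m)

    def run(self, xs, shortcuts):
        """Apply step() over a sequence, starting from a zero hidden state."""
        h = np.zeros(self.hidden_size)
        outputs = []
        for x_t, shortcut_t in zip(xs, shortcuts):
            h = self.step(x_t, h, shortcut_t)
            outputs.append(h)
        return np.stack(outputs)
```

In a stacked tagger, `xs` would be the outputs of the layer directly below and `shortcuts` the outputs of some earlier layer, projected to `hidden_size` if the dimensions differ. Which layers feed the shortcut is exactly what the abstract's "topologies and compositions" vary, so that choice is left open here.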
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
