Deep RNNs Encode Soft Hierarchical Syntax
Terra Blevins, Omer Levy, Luke Zettlemoyer

TL;DR
Deep RNNs learn soft hierarchical syntactic structure from a range of training objectives without explicit syntax supervision, with deeper layers capturing higher-level syntactic information.
Contribution
This study demonstrates that deep RNNs develop a soft hierarchical representation of syntax across different NLP training objectives, even without explicit syntactic supervision.
Findings
Network depth correlates with syntactic depth in representations.
Models encode significant syntactic information without explicit supervision.
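The hierarchical encoding is robust across all four pretraining objectives: dependency parsing, semantic role labeling, machine translation, and language modeling.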
Abstract
We present a set of experiments to demonstrate that deep recurrent neural networks (RNNs) learn internal representations that capture soft hierarchical notions of syntax from highly varied supervision. We consider four syntax tasks at different depths of the parse tree; for each word, we predict its part of speech as well as the first (parent), second (grandparent) and third level (great-grandparent) constituent labels that appear above it. These predictions are made from representations produced at different depths in networks that are pretrained with one of four objectives: dependency parsing, semantic role labeling, machine translation, or language modeling. In every case, we find a correspondence between network depth and syntactic depth, suggesting that a soft syntactic hierarchy emerges. This effect is robust across all conditions, indicating that the models encode significant…
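To make the four prediction targets concrete, here is a minimal sketch of how the per-word labels described in the abstract could be read off a constituency parse. It is not the authors' code. The helper names `parse_bracketed` and `ancestor_labels`, the example sentence, and the choice to pad missing ancestors with `None` are all assumptions made for illustration, and the sketch does no special handling of unary chains.

```python
def parse_bracketed(s):
    """Parse a PTB-style bracketed string into nested (label, children) tuples."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return read()


def ancestor_labels(tree, max_levels=3):
    """Return (word, POS, [parent, grandparent, great-grandparent]) per word.

    Assumption: ancestors that do not exist are padded with None.
    """
    rows = []

    def walk(node, path):
        label, children = node
        for child in children:
            if isinstance(child, str):
                # `label` is the preterminal (POS tag); `path` lists the
                # constituents above it, root first, so reverse it.
                above = path[::-1][:max_levels]
                above += [None] * (max_levels - len(above))
                rows.append((child, label, above))
            else:
                walk(child, path + [label])

    walk(tree, [])
    return rows


# Hypothetical example sentence, not taken from the paper.
tree = parse_bracketed(
    "(S (NP (DT the) (NN cat)) (VP (VBD sat) (PP (IN on) (NP (DT the) (NN mat)))))"
)
rows = ancestor_labels(tree)
for word, pos_tag, above in rows:
    print(word, pos_tag, above)
```

In this sketch each row supplies one target per task: the POS tag plus the first, second, and third constituent labels above the word. A probing classifier would then be trained to predict each of these targets from the representation a given network layer produces for that word.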
