Leveraging Grammar Induction for Language Understanding and Generation
Jushi Kai, Shengyuan Hou, Yusheng Huang, Zhouhan Lin

TL;DR
This paper presents an unsupervised grammar induction approach that feeds induced syntactic structures into Transformer models as a syntactic mask on self-attention, improving performance on machine translation and natural language understanding tasks.
Contribution
Introduces an unsupervised grammar induction method whose parser is trained together with the downstream task, so that induced constituency structures and dependency relations guide Transformer self-attention without additional syntax annotations.
Findings
Outperforms the original Transformer on multiple machine translation and natural language understanding tasks
Effective in both from-scratch and pre-trained scenarios
Highlights the benefit of explicit grammatical modeling
Abstract
Grammar induction has made significant progress in recent years. However, it is not clear how the application of induced grammar could enhance practical performance in downstream tasks. In this work, we introduce an unsupervised grammar induction method for language understanding and generation. We construct a grammar parser to induce constituency structures and dependency relations, which is simultaneously trained on downstream tasks without additional syntax annotations. The induced grammar features are subsequently incorporated into Transformer as a syntactic mask to guide self-attention. We evaluate and apply our method to multiple machine translation tasks and natural language understanding tasks. Our method demonstrates superior performance compared to the original Transformer and other models enhanced with external parsers. Experimental results indicate that our method is…
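The abstract describes the induced grammar features as a syntactic mask that guides self-attention. A minimal sketch of that general idea follows. It is not the authors' implementation: the multiplicative soft mask, the renormalization step, the function name masked_self_attention, and the hand-written block mask are all assumptions made for illustration. In the paper the mask comes from a learned parser, which is not reproduced here.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def masked_self_attention(x, w_q, w_k, w_v, syntactic_mask, eps=1e-9):
    """Single-head self-attention reweighted by a soft syntactic mask.

    ASSUMPTION: the mask is multiplied into the attention probabilities,
    which are then renormalized. The source does not state how the mask
    is applied; this is one plausible reading.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # scaled dot-product scores
    attn = softmax(scores, axis=-1)           # ordinary attention weights
    attn = attn * syntactic_mask              # down-weight unrelated token pairs
    attn = attn / (attn.sum(axis=-1, keepdims=True) + eps)  # rows sum to ~1 again
    return attn @ v, attn


rng = np.random.default_rng(0)
n_tokens, d_model = 4, 8
x = rng.normal(size=(n_tokens, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

# Hypothetical mask: tokens 0-1 and tokens 2-3 form two constituents.
# Pairs inside a constituent keep full weight; pairs across get 0.1.
mask = np.full((n_tokens, n_tokens), 0.1)
mask[:2, :2] = 1.0
mask[2:, 2:] = 1.0

out, attn = masked_self_attention(x, w_q, w_k, w_v, mask)
print(attn.round(3))
```

With an all-ones mask the function reduces to standard scaled dot-product attention, which makes it easy to compare against an unmodified Transformer layer.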
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: Attention Is All You Need · Dense Connections · Adam · Linear Layer · Residual Connection · Position-Wise Feed-Forward Layer · Label Smoothing · Dropout · Byte Pair Encoding · Absolute Position Encodings
