On Tree-Based Neural Sentence Modeling
Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

TL;DR
This paper investigates the impact of explicit syntactic structures in tree-based neural sentence models, revealing that trivial trees without syntactic info perform comparably or better on various tasks, challenging the assumed importance of syntax guidance.
Contribution
The study demonstrates that explicit syntactic trees are not essential for effective neural sentence modeling, providing insights into alternative tree structures and their influence on performance.
Findings
Trivial trees perform as well or better than syntactic trees on multiple tasks.
Explicit syntax guidance may not be the main factor in tree-based model success.
Tree modeling benefits when crucial words are closer to the final representation.
Abstract
Neural networks with tree-based sentence encoders have shown better results on many downstream tasks. Most of existing tree-based encoders adopt syntactic parsing trees as the explicit structure prior. To study the effectiveness of different tree structures, we replace the parsing trees with trivial trees (i.e., binary balanced tree, left-branching tree and right-branching tree) in the encoders. Though trivial trees contain no syntactic information, those encoders get competitive or even better results on all of the ten downstream tasks we investigated. This surprising result indicates that explicit syntax guidance may not be the main contributor to the superior performances of tree-based neural sentence modeling. Further analysis show that tree modeling gives better results when crucial words are closer to the final representation. Additional experiments give more clues on how to…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
