Learning Better Sentence Representation with Syntax Information
Chen Yang (University of Science and Technology of China)

TL;DR
This paper introduces a novel method that combines syntactic information with pre-trained language models to enhance sentence understanding, demonstrating significant improvements in sentence completion and relation extraction tasks.
Contribution
The paper proposes a dependency syntax expansion (DSE) model to effectively integrate syntactic knowledge into pre-trained language models for better semantic understanding.
Findings
Achieved 91.2% accuracy on sentence completion, outperforming the baseline by 37.8%.
Attained a 75.1% F1 score on relation extraction, showing competitive performance.
Demonstrated the effectiveness of syntax integration across different pre-trained models.
Abstract
Sentence semantic understanding is a key topic in the field of natural language processing. Recently, contextualized word representations derived from pre-trained language models such as ELMo and BERT have shown significant improvements on a wide range of semantic tasks, e.g. question answering, text classification, and sentiment analysis. However, how to add external knowledge to further improve the semantic modeling capability of such models is worth probing. In this paper, we propose a novel approach to combining syntax information with a pre-trained language model. First, to evaluate the effect of the pre-training model, we introduce RNN-based and Transformer-based pre-trained language models; second, to better integrate external knowledge such as syntactic information with the pre-training model, we propose a dependency syntax expansion (DSE) model. For evaluation,…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining
Methods: Linear Layer · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Weight Decay · Linear Warmup With Linear Decay · Dense Connections · Bidirectional LSTM · Multi-Head Attention · Attention Is All You Need
