TL;DR
This paper introduces Tree-LSTM, a generalization of LSTMs from linear chains to tree-structured network topologies, and reports that it outperforms existing systems and strong LSTM baselines on semantic relatedness prediction and sentiment classification.
Contribution
The paper presents the Tree-LSTM, a model that generalizes the linear-chain LSTM to tree-structured network topologies, so that representations can follow the syntactic structure by which words combine into phrases.
Findings
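Tree-LSTM outperforms existing systems and strong LSTM baselines on semantic relatedness prediction for sentence pairs (SemEval 2014, Task 1).
Tree-LSTM outperforms existing systems and strong LSTM baselines on sentiment classification (Stanford Sentiment Treebank).
Compared with linear-chain LSTMs, Tree-LSTM is better suited to modeling the hierarchical structure of language.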
Abstract
Because of their superior ability to preserve sequence information over time, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network with a more complex computational unit, have obtained strong results on a variety of sequence modeling tasks. The only underlying LSTM structure that has been explored so far is a linear chain. However, natural language exhibits syntactic properties that would naturally combine words to phrases. We introduce the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies. Tree-LSTMs outperform all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences (SemEval 2014, Task 1) and sentiment classification (Stanford Sentiment Treebank).
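The abstract does not spell out how an LSTM unit is extended to a tree, so the following is only a minimal sketch of one way to do it, a child-sum style cell in which a node sums the hidden states of its children and applies a separate forget gate to each child's memory cell. The function names (`init_params`, `tree_lstm_node`, `encode`), the `(vector, children)` tree encoding, the weight initialization, and the use of plain Python lists instead of a tensor library are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def tanh(v):
    return [math.tanh(x) for x in v]


def affine(W, x, U, h, b):
    """Compute W @ x + U @ h + b for list-of-lists matrices."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + bi
        for W_row, U_row, bi in zip(W, U, b)
    ]


def init_params(input_dim, hidden_dim, seed=0):
    """Hypothetical initializer: one (W, U, b) triple per gate i, f, o, u."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        gate: (mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim), [0.0] * hidden_dim)
        for gate in "ifou"
    }


def gate(params, name, x, h, activation):
    W, U, b = params[name]
    return activation(affine(W, x, U, h, b))


def tree_lstm_node(x, children, params):
    """One node update. `children` is a list of (h_k, c_k) pairs; returns (h, c)."""
    hidden_dim = len(params["i"][2])

    # Sum the children's hidden states (a zero vector at a leaf).
    h_sum = [0.0] * hidden_dim
    for h_k, _ in children:
        h_sum = [a + b for a, b in zip(h_sum, h_k)]

    i = gate(params, "i", x, h_sum, sigmoid)  # input gate
    o = gate(params, "o", x, h_sum, sigmoid)  # output gate
    u = gate(params, "u", x, h_sum, tanh)     # candidate update

    # Memory cell: gated candidate plus each child's cell, scaled by a
    # forget gate computed from that child's own hidden state.
    c = [ii * ui for ii, ui in zip(i, u)]
    for h_k, c_k in children:
        f_k = gate(params, "f", x, h_k, sigmoid)
        c = [cj + fj * ckj for cj, fj, ckj in zip(c, f_k, c_k)]

    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def encode(tree, params):
    """Encode a tree given as (input_vector, [subtrees]) bottom-up."""
    x, subtrees = tree
    children = [encode(t, params) for t in subtrees]
    return tree_lstm_node(x, children, params)


# Toy example: a root with two leaf children, using 3-dimensional inputs.
params = init_params(input_dim=3, hidden_dim=4)
leaf_a = ([1.0, 0.0, 0.0], [])
leaf_b = ([0.0, 1.0, 0.0], [])
root = ([0.0, 0.0, 1.0], [leaf_a, leaf_b])
h_root, c_root = encode(root, params)
```

In this sketch a node with exactly one child behaves like an ordinary LSTM step, which is the sense in which the tree version generalizes the linear chain. Because child states are summed, the result does not depend on the order of the children.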
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
