Sentence Modeling via Multiple Word Embeddings and Multi-level Comparison for Semantic Textual Similarity

Huy Nguyen Tien; Minh Nguyen Le; Yamasaki Tomohiro; Izuha Tatsuya

arXiv:1805.07882 · cs.CL · May 22, 2018 · 6 cites

Open Access

TL;DR

This paper introduces M-MaxLSTM-CNN, a novel model that employs multiple word embeddings and multi-level comparison to improve semantic textual similarity assessment, outperforming state-of-the-art methods without relying on handcrafted features or uniform embedding dimensions.

Contribution

The paper proposes a new sentence modeling approach using multiple word embeddings and multi-level comparison, enhancing semantic similarity evaluation without handcrafted features.

Findings

01. Outperforms state-of-the-art methods on STS Benchmark and SICK datasets.

02. Does not require handcrafted features or uniform embedding dimensions.

03. Shows strong performance in paraphrase detection and textual entailment tasks.

Abstract

Different word embedding models capture different aspects of linguistic properties. This inspired us to propose a model (M-MaxLSTM-CNN) that employs multiple sets of word embeddings to evaluate sentence similarity/relation. Representing each word by multiple word embeddings, the MaxLSTM-CNN encoder generates a novel sentence embedding. We then learn the similarity/relation between our sentence embeddings via multi-level comparison. Our method M-MaxLSTM-CNN consistently shows strong performance on several tasks (i.e., measuring textual similarity, identifying paraphrases, and recognizing textual entailment). According to the experimental results on the STS Benchmark dataset and the SICK dataset from SemEval, M-MaxLSTM-CNN outperforms the state-of-the-art methods for textual similarity tasks. Our model does not use hand-crafted features (e.g., alignment features, n-gram overlaps, dependency features) as…
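
The core idea in the abstract, representing each word by several embedding sets and then comparing the resulting sentence embeddings, can be illustrated with a toy sketch. This is not the authors' implementation: the lookup tables `EMB_A` and `EMB_B`, their values, and the function names are hypothetical, the element-wise max stands in for the learned MaxLSTM-CNN encoder, and the three comparison signals (cosine, element-wise product, absolute difference) are assumed examples of sentence-level comparison rather than the paper's full multi-level scheme.

```python
import math

# Hypothetical toy lookup tables standing in for two pretrained embedding
# sets with different dimensions (2-d and 3-d). All values are made up.
EMB_A = {"a": [0.1, 0.3], "cat": [0.7, 0.2], "dog": [0.6, 0.3], "sleeps": [0.2, 0.9]}
EMB_B = {"a": [0.0, 0.1, 0.2], "cat": [0.5, 0.4, 0.1],
         "dog": [0.5, 0.3, 0.2], "sleeps": [0.1, 0.8, 0.6]}


def word_vector(word):
    """Concatenate the word's vectors from every embedding set.

    Concatenation works even when the sets have different dimensions.
    """
    return EMB_A[word] + EMB_B[word]


def sentence_vector(tokens):
    """Element-wise max over word vectors.

    Assumption: a simple stand-in for a learned sentence encoder.
    """
    vectors = [word_vector(t) for t in tokens]
    return [max(column) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def compare(u, v):
    """Collect several comparison signals between two sentence vectors."""
    return {
        "cosine": cosine(u, v),
        "product": [x * y for x, y in zip(u, v)],
        "abs_diff": [abs(x - y) for x, y in zip(u, v)],
    }


s1 = sentence_vector(["a", "cat", "sleeps"])
s2 = sentence_vector(["a", "dog", "sleeps"])
features = compare(s1, s2)
```

In a trained model the comparison features would feed a classifier or regressor; here `features` is only a dictionary of raw signals for inspection.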

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification