A Comprehensive Comparative Study of Word and Sentence Similarity Measures

Issa Atoum; Ahmed Otoom; Narayanan Kulathuramaiyer

arXiv:1610.04533 · cs.IR · October 17, 2016

TL;DR

This paper reviews and compares various word and sentence similarity measures, finding that hybrid semantic approaches outperform knowledge-based and corpus-based methods across benchmark datasets.

Contribution

It provides a comprehensive comparison of word and sentence similarity measures and shows that hybrid semantic methods perform best on the studied benchmark datasets.

Findings

01. On the studied datasets, hybrid semantic measures outperform both knowledge-based and corpus-based measures

02. Knowledge-based measures are less effective than hybrid measures

03. Corpus-based measures show moderate performance

Abstract

Sentence similarity is considered the basis of many natural language tasks such as information retrieval, question answering, and text summarization. The semantic meaning shared by the compared text fragments is based on the words' semantic features and their relationships. This article reviews a set of word and sentence similarity measures and compares them on benchmark datasets. On the studied datasets, results showed that hybrid semantic measures perform better than both knowledge-based and corpus-based measures.
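To make the idea of building sentence similarity from word-level similarity concrete, here is a minimal Python sketch. It is not a measure from the paper: the function names, the best-match averaging scheme, and the hand-written scores in `TOY_RELATED` are all assumptions chosen for illustration. In practice, the `word_sim` argument would be replaced by a knowledge-based, corpus-based, or hybrid word measure.

```python
from typing import Callable, List

# Toy, hand-written relatedness scores (hypothetical values, not taken
# from the paper or from any benchmark dataset).
TOY_RELATED = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("boy", "lad")): 0.8,
}


def toy_word_similarity(a: str, b: str) -> float:
    """Return 1.0 for identical words, a toy score for listed pairs, else 0.0."""
    if a == b:
        return 1.0
    return TOY_RELATED.get(frozenset((a, b)), 0.0)


def sentence_similarity(
    s1: str,
    s2: str,
    word_sim: Callable[[str, str], float] = toy_word_similarity,
) -> float:
    """Symmetric best-match average of word similarities between two sentences."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    if not w1 or not w2:
        return 0.0

    def directed(src: List[str], dst: List[str]) -> float:
        # For each source word, keep its best match in the other sentence.
        return sum(max(word_sim(a, b) for b in dst) for a in src) / len(src)

    # Average both directions so the score does not depend on argument order.
    return (directed(w1, w2) + directed(w2, w1)) / 2


if __name__ == "__main__":
    print(sentence_similarity("the car stopped", "the automobile stopped"))
```

Keeping the word-level measure as a parameter means the same sentence-level scheme can be reused when swapping in different word measures for comparison.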
