# Semantic Similarity from Natural Language and Ontology Analysis

**Authors:** Sébastien Harispe, Sylvie Ranwez, Stefan Janaqi, Jacky Montmain

arXiv: 1704.05295 · 2017-04-19

## TL;DR

This monograph surveys methods for measuring the semantic similarity or relatedness of units of language (words, sentences) and of concepts and instances defined in knowledge bases, covering both corpus-based NLP techniques and ontology analysis.

## Contribution

It provides a comprehensive overview of state-of-the-art semantic similarity measures, integrating NLP and ontology-based approaches for better semantic comparison.

## Key findings

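- Two main approaches are presented in detail: NLP-based measures that rely on corpus analysis and semantic models, and knowledge-based measures that rely on semantic networks, thesauri, or ontologies.
- The measures go beyond surface form: word pairs are compared by meaning rather than by spelling.
- The book aims to guide both novices and researchers towards a better understanding of semantic similarity estimation and of semantic measures in general.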
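
To make the NLP-based (corpus-driven) family named in the list above more concrete, here is a minimal sketch of one common distributional idea: words are represented by the words they co-occur with, and two words are compared by the cosine of their co-occurrence vectors. The toy corpus, the sentence-level context window, and the helper names `cooccurrence_vectors` and `cosine` are assumptions made for illustration; they are not taken from the book.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for illustration only.
CORPUS = [
    "she drinks hot tea every morning",
    "he drinks hot coffee every morning",
    "the child eats sweet toffee after lunch",
    "a cup of tea keeps me awake",
    "a cup of coffee keeps me awake",
    "sticky toffee is a sweet treat",
]


def cooccurrence_vectors(sentences):
    """Count, for each word, the other words found in the same sentence."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j, context in enumerate(tokens):
                if i != j:
                    vectors[word][context] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = cooccurrence_vectors(CORPUS)
print("tea / coffee   :", cosine(vectors["tea"], vectors["coffee"]))
print("toffee / coffee:", cosine(vectors["toffee"], vectors["coffee"]))
```

On this toy corpus, `tea` and `coffee` share most of their contexts and score higher than `toffee` and `coffee`. Real corpus-based measures use far larger corpora and usually weight or reduce the raw counts.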

## Abstract

Artificial Intelligence federates numerous scientific fields in the aim of developing machines able to assist human operators performing complex treatments -- most of which demand high cognitive skills (e.g. learning or decision processes). Central to this quest is to give machines the ability to estimate the likeness or similarity between things in the way human beings estimate the similarity between stimuli. In this context, this book focuses on semantic measures: approaches designed for comparing semantic entities such as units of language, e.g. words, sentences, or concepts and instances defined into knowledge bases. The aim of these measures is to assess the similarity or relatedness of such semantic entities by taking into account their semantics, i.e. their meaning -- intuitively, the words tea and coffee, which both refer to stimulating beverage, will be estimated to be more semantically similar than the words toffee (confection) and coffee, despite that the last pair has a higher syntactic similarity. The two state-of-the-art approaches for estimating and quantifying semantic similarities/relatedness of semantic entities are presented in detail: the first one relies on corpora analysis and is based on Natural Language Processing techniques and semantic models while the second is based on more or less formal, computer-readable and workable forms of knowledge such as semantic networks, thesaurus or ontologies. (...) Beyond a simple inventory and categorization of existing measures, the aim of this monograph is to convey novices as well as researchers of these domains towards a better understanding of semantic similarity estimation and more generally semantic measures.
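
The tea / coffee / toffee example from the abstract can be sketched in a few lines. The snippet below contrasts a character-level score (Python's `difflib.SequenceMatcher`, used here as a stand-in for what the abstract calls syntactic similarity) with a simple path-based score over a tiny taxonomy. The taxonomy, the `1 / (1 + path length)` formula, and the function names are assumptions chosen for illustration; they are not presented as the measures defined in the book.

```python
from difflib import SequenceMatcher

# Hypothetical toy taxonomy (child -> parent), invented for illustration.
PARENT = {
    "tea": "stimulating_beverage",
    "coffee": "stimulating_beverage",
    "stimulating_beverage": "beverage",
    "toffee": "confection",
    "beverage": "food",
    "confection": "food",
    "food": None,  # root
}


def ancestors(concept):
    """Return the path from a concept up to the root, concept included."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def path_similarity(a, b):
    """1 / (1 + number of edges between a and b via their closest common ancestor)."""
    path_b = ancestors(b)
    depth_in_b = {concept: i for i, concept in enumerate(path_b)}
    for i, concept in enumerate(ancestors(a)):
        if concept in depth_in_b:
            return 1.0 / (1 + i + depth_in_b[concept])
    return 0.0


def surface_similarity(a, b):
    """Character-level similarity of the two spellings, in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


for pair in [("tea", "coffee"), ("toffee", "coffee")]:
    print(pair,
          "surface:", round(surface_similarity(*pair), 3),
          "taxonomy:", round(path_similarity(*pair), 3))
```

Under these assumptions the two scores rank the pairs in opposite order: the spelling-based score favours toffee / coffee, while the taxonomy-based score favours tea / coffee, which is the contrast the abstract describes.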

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.05295/full.md

---
Source: https://tomesphere.com/paper/1704.05295