# A Study of Metrics of Distance and Correlation Between Ranked Lists for Compositionality Detection

**Authors:** Christina Lioma, Niels Dalum Hansen

arXiv: 1703.03640 · 2017-03-13

## TL;DR

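This paper introduces an unsupervised method for detecting compositionality in language. It represents a phrase and its synonym-substituted version as ranked lists of term weights and measures their similarity with well-known distance and correlation metrics, outperforming strong supervised baselines on a dataset of 1048 human-annotated phrases.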

## Contribution

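The paper proposes a completely unsupervised approach to compositionality detection that represents phrases as ranked lists of term weights and compares them with a range of distance and correlation metrics. On a publicly available dataset of 1048 human-annotated phrases, it measures compositionality better than strong supervised baselines.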

## Key findings

- The unsupervised method outperforms strong supervised baselines on 1048 human-annotated phrases
- Ranked lists of term weights are an effective phrase representation for compositionality detection
- The improvement holds for every distance and correlation metric considered

## Abstract

Compositionality in language refers to how much the meaning of some phrase can be decomposed into the meaning of its constituents and the way these constituents are combined. Based on the premise that substitution by synonyms is meaning-preserving, compositionality can be approximated as the semantic similarity between a phrase and a version of that phrase where words have been replaced by their synonyms. Different ways of representing such phrases exist (e.g., vectors [1] or language models [2]), and the choice of representation affects the measurement of semantic similarity. We propose a new compositionality detection method that represents phrases as ranked lists of term weights. Our method approximates the semantic similarity between two ranked list representations using a range of well-known distance and correlation metrics. In contrast to most state-of-the-art approaches in compositionality detection, our method is completely unsupervised. Experiments with a publicly available dataset of 1048 human-annotated phrases show that, compared to strong supervised baselines, our approach provides superior measurement of compositionality using any of the distance and correlation metrics considered.
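
To make the idea of comparing two ranked lists of term weights concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say which metrics are used or how term weights are computed, so the two metrics shown (Spearman's rho and Kendall's tau-a), the toy weights, and the helper names `align`, `average_ranks`, `spearman_rho`, and `kendall_tau_a` are all illustrative assumptions. Terms missing from one list are assumed to have weight 0.

```python
from itertools import combinations
from math import sqrt


def align(a, b):
    """Put two term->weight dicts on a shared vocabulary (missing terms get 0.0).

    Zero-filling is an assumption of this sketch, not something the paper states.
    """
    vocab = sorted(set(a) | set(b))
    return [a.get(t, 0.0) for t in vocab], [b.get(t, 0.0) for t in vocab]


def average_ranks(values):
    """Rank values in descending order (1 = largest), averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of positions tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sqrt(sum((p - mx) ** 2 for p in x))
    sy = sqrt(sum((q - my) ** 2 for q in y))
    return cov / (sx * sy)


def spearman_rho(wa, wb):
    """Spearman's rho: Pearson correlation computed on the ranks of the weights."""
    return pearson(average_ranks(wa), average_ranks(wb))


def kendall_tau_a(wa, wb):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs; ties count as 0."""
    pairs = list(combinations(range(len(wa)), 2))
    score = 0
    for i, j in pairs:
        s = (wa[i] - wa[j]) * (wb[i] - wb[j])
        score += (s > 0) - (s < 0)
    return score / len(pairs)


if __name__ == "__main__":
    # Hypothetical term weights for a phrase and its synonym-substituted version.
    phrase = {"red": 0.9, "car": 0.7, "road": 0.4, "engine": 0.2}
    substituted = {"red": 0.8, "car": 0.3, "road": 0.5, "vehicle": 0.6}
    wa, wb = align(phrase, substituted)
    print("Spearman rho:", spearman_rho(wa, wb))
    print("Kendall tau-a:", kendall_tau_a(wa, wb))
```

Under the synonym-substitution premise, a higher correlation between the two lists would be read as a more compositional phrase, while a low or negative correlation suggests that substitution changed the meaning.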

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.03640/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1703.03640/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/1703.03640/full.md

---
Source: https://tomesphere.com/paper/1703.03640