Effects of High-Order Co-occurrences on Word Semantic Similarities

Benoît Lemaire (TIMC); Guy Denhière (LPC)

arXiv:0804.0143·cs.CL·December 18, 2008·22 cites

Open Access

TL;DR

This paper builds a computational model of how word meaning is constructed from text and uses it to show how direct and high-order co-occurrences affect the semantic similarity between two words. It concludes that measuring similarity by co-occurrence frequency probably introduces a bias.

Contribution

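The paper introduces a model that simulates, paragraph by paragraph, how direct co-occurrences, occurrences of one word without the other, and high-order co-occurrences each change the similarity between two words. The results expose the limits of using co-occurrence frequency as a similarity measure.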

Findings

01. Similarity strongly increases with direct co-occurrence

02. Similarity decreases when one word occurs without the other

03. High-order co-occurrences slightly increase similarity

Abstract

A computational model of the construction of word meaning through exposure to texts is built in order to simulate the effects of co-occurrence values on word semantic similarities, paragraph by paragraph. Semantic similarity is here viewed as association. It turns out that the similarity between two words W1 and W2 strongly increases with a co-occurrence, decreases with the occurrence of W1 without W2 or W2 without W1, and slightly increases with high-order co-occurrences. Therefore, operationalizing similarity as a frequency of co-occurrence probably introduces a bias: first, there are cases in which there is similarity without co-occurrence and, second, the frequency of co-occurrence overestimates similarity.
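
The claim that similarity can exist without co-occurrence can be illustrated with a toy example. The sketch below is not the paper's model: it assumes, purely for illustration, that each word is represented by the counts of words sharing a paragraph with it and that similarity is the cosine between those count vectors. The corpus, the function names, and the choice of cosine are all hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import sqrt


def cooccurrence_vectors(paragraphs):
    """Map each word to a Counter of the words that share a paragraph with it."""
    vectors = {}
    for paragraph in paragraphs:
        words = set(paragraph.split())
        for word in words:
            vectors.setdefault(word, Counter())
        # Count every ordered pair of distinct words in the paragraph.
        for w1, w2 in permutations(words, 2):
            vectors[w1][w2] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def direct_cooccurrences(paragraphs, w1, w2):
    """Number of paragraphs that contain both words."""
    return sum(1 for p in paragraphs if {w1, w2} <= set(p.split()))


# Hypothetical corpus: "doctor" and "nurse" never share a paragraph,
# but both appear with "hospital" and "patient" (second-order co-occurrence).
paragraphs = [
    "doctor hospital",
    "nurse hospital",
    "doctor patient",
    "nurse patient",
    "banana fruit",
]

vectors = cooccurrence_vectors(paragraphs)
direct = direct_cooccurrences(paragraphs, "doctor", "nurse")
sim_second_order = cosine(vectors["doctor"], vectors["nurse"])
sim_unrelated = cosine(vectors["doctor"], vectors["banana"])

print(f"direct co-occurrences doctor/nurse: {direct}")
print(f"similarity doctor/nurse:  {sim_second_order:.2f}")
print(f"similarity doctor/banana: {sim_unrelated:.2f}")
```

In this toy corpus a pure frequency-of-co-occurrence measure would score "doctor" and "nurse" at zero, while the context-based measure scores them as highly similar because they share neighbours. This matches the first bias named in the abstract; the sketch says nothing about the size of the effect in the paper's own simulations.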

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques