Effects of High-Order Co-occurrences on Word Semantic Similarities
Benoît Lemaire (TIMC), Guy Denhière (LPC)

TL;DR
This paper presents a computational model demonstrating how high-order co-occurrences influence word semantic similarities, revealing biases in traditional co-occurrence-based similarity measures.
Contribution
It introduces a model that simulates the effects of various co-occurrence types on word similarity, highlighting limitations of frequency-based measures.
Findings
Similarity strongly increases with direct co-occurrence
Similarity decreases when one word occurs without the other
High-order co-occurrences slightly increase similarity
Abstract
A computational model of the construction of word meaning through exposure to texts is built in order to simulate the effects of co-occurrence values on word semantic similarities, paragraph by paragraph. Semantic similarity is here viewed as association. It turns out that the similarity between two words W1 and W2 strongly increases with a co-occurrence, decreases with the occurrence of W1 without W2 or W2 without W1, and slightly increases with high-order co-occurrences. Therefore, operationalizing similarity as a frequency of co-occurrence probably introduces a bias: first, there are cases in which there is similarity without co-occurrence and, second, the frequency of co-occurrence overestimates similarity.
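The distinction between direct and high-order co-occurrence can be illustrated with a small counting sketch. This is not the authors' model: the toy corpus, the function names, and the use of whitespace tokenization are all assumptions made for illustration. It only shows how two words can share neighbours, a second-order link, without ever appearing in the same paragraph.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(paragraphs):
    """Count, for each unordered word pair, the paragraphs containing both words."""
    counts = defaultdict(int)
    for paragraph in paragraphs:
        # Each word counts once per paragraph; sorting gives a canonical pair order.
        words = sorted(set(paragraph.lower().split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts


def direct(counts, w1, w2):
    """Number of paragraphs in which w1 and w2 appear together (first order)."""
    return counts.get(tuple(sorted((w1, w2))), 0)


def shared_neighbours(counts, w1, w2):
    """Words that co-occur with both w1 and w2 (second-order links)."""
    neighbours = defaultdict(set)
    for a, b in counts:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return (neighbours[w1] & neighbours[w2]) - {w1, w2}


# Hypothetical toy corpus: one string per paragraph.
paragraphs = [
    "doctor hospital",
    "nurse hospital",
    "doctor patient",
    "nurse patient",
]

counts = cooccurrence_counts(paragraphs)
print(direct(counts, "doctor", "nurse"))
print(sorted(shared_neighbours(counts, "doctor", "nurse")))
```

In this corpus "doctor" and "nurse" never share a paragraph, so a measure based only on co-occurrence frequency would assign them no similarity, even though both are linked to "hospital" and "patient".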
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
