Monitoring Term Drift Based on Semantic Consistency in an Evolving Vector Field

Peter Wittek; Sándor Darányi; Efstratios Kontopoulos; Theodoros Moysiadis; Ioannis Kompatsiaris

arXiv:1502.01753·cs.CL·November 22, 2016

TL;DR

This paper introduces a scalable method for monitoring semantic drift in a large, evolving text corpus by combining random indexing with evolving self-organizing maps, evaluated on 12.8 million Amazon book reviews.

Contribution

It combines random indexing with evolving self-organizing maps (ESOM) to track semantic change over time in large-scale text collections.

Findings

01. Terms within ESOM clusters show high semantic consistency at the 0.05 significance level.

02. Semantic consistency decreases over time, but not significantly.

03. The method is scalable and admits a philosophical interpretation (Aristotelian potentiality vs. actuality).

Abstract

Based on the Aristotelian concept of potentiality vs. actuality allowing for the study of energy and dynamics in language, we propose a field approach to lexical analysis. Falling back on the distributional hypothesis to statistically model word meaning, we used evolving fields as a metaphor to express time-dependent changes in a vector space model by a combination of random indexing and evolving self-organizing maps (ESOM). To monitor semantic drifts within the observation period, an experiment was carried out on the term space of a collection of 12.8 million Amazon book reviews. For evaluation, the semantic consistency of ESOM term clusters was compared with their respective neighbourhoods in WordNet, and contrasted with distances among term vectors by random indexing. We found that at 0.05 level of significance, the terms in the clusters showed a high level of semantic consistency.…
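As a rough illustration of the random indexing step mentioned in the abstract, the sketch below builds term context vectors by summing sparse ternary index vectors of neighbouring words. It is not the authors' implementation: the function names (`index_vector`, `random_indexing`, `cosine`), the dimension, the number of non-zero entries, and the window size are all assumptions chosen for readability, and the ESOM clustering step is omitted entirely. Passing the same `index` dictionary across time periods keeps the vectors of successive periods comparable, which is one plausible way to observe drift.

```python
import random
from collections import defaultdict


def index_vector(dim, nonzeros, rng):
    """Sparse ternary index vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(documents, dim=64, nonzeros=4, window=2, seed=0, index=None):
    """Return {term: context vector} for tokenized documents.

    Hypothetical helper: `index` maps each word to its fixed index vector
    and can be reused across time periods (an assumption of this sketch).
    """
    rng = random.Random(seed)
    index = {} if index is None else index
    context = defaultdict(lambda: [0] * dim)
    for tokens in documents:
        # Assign an index vector to every word not seen before.
        for tok in tokens:
            if tok not in index:
                index[tok] = index_vector(dim, nonzeros, rng)
        # Add the index vectors of neighbours inside the sliding window.
        for i, term in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx = context[term]
                    for k, value in enumerate(index[tokens[j]]):
                        ctx[k] += value
    return dict(context)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy corpus: "book" and "story" share the single neighbour "good".
vectors = random_indexing([["good", "book"], ["good", "story"]])
similarity = cosine(vectors["book"], vectors["story"])
```

In this toy corpus, "book" and "story" each have "good" as their only neighbour, so their context vectors coincide. On real data, comparing a term's vectors from consecutive periods with `cosine` would give a crude drift signal under the same assumptions.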

Code & Models

Repositories

peterwittek/concept_drifts (official)
