Network analysis of named entity co-occurrences in written texts

Diego R. Amancio

arXiv:1509.05281·cs.CL·July 7, 2016

TL;DR

This paper introduces a novel network-based approach to analyze named entity co-occurrences in texts, revealing topological features and improving pattern recognition over traditional word adjacency models.

Contribution

The study proposes a new co-occurrence network model in which links between entities are established according to a null model based on random, shuffled texts, and demonstrates its effectiveness in identifying references and characterizing texts.

Findings

01. The model exhibits small-world properties with high clustering.

02. It outperforms traditional adjacency networks in reference identification.

03. Topological analysis enhances understanding of text structure.
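
The small-world finding can be made concrete with two standard measurements: the average local clustering coefficient and the average shortest path length. A network is usually called small-world when the former is high and the latter is short relative to a random graph of the same size. The sketch below is only an illustration of those two quantities on an undirected edge list; the function names and the toy graph are hypothetical, and nothing here reproduces the paper's actual measurements.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an iterable of undirected (u, v) edges into an adjacency dict."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count links among the neighbours of this node (each pair once).
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def average_shortest_path(adj):
    """Mean BFS distance over all ordered pairs of mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Hypothetical toy network: a triangle (a, b, c) with one pendant node d.
toy = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])
print(average_clustering(toy), average_shortest_path(toy))
```

In practice both values would be compared against those of randomized graphs with the same number of nodes and edges, since "high" and "short" are only meaningful relative to such a baseline.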

Abstract

The use of methods borrowed from statistics and physics to analyze written texts has allowed the discovery of unprecedented patterns of human behavior and cognition by establishing links between model features and language structure. While current models have been useful to unveil patterns via analysis of syntactical and semantical networks, only a few works have probed the relevance of investigating the structure arising from the relationship between relevant entities such as characters, locations and organizations. In this study, we represent entities appearing in the same context as a co-occurrence network, where links are established according to a null model based on random, shuffled texts. Computational simulations performed in novels revealed that the proposed model displays interesting topological features, such as the small world feature, characterized by high values of…
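
The idea of linking entities according to a null model built from shuffled texts can be sketched as follows. This is a minimal illustration under several assumptions that the abstract does not specify: a "context" is taken to be a pre-segmented list of tokens (for example a paragraph), entities are assumed to be already identified, and a link is kept when the observed co-occurrence count exceeds the shuffled-text mean by `z_min` standard deviations. The function names, the `n_shuffles` and `z_min` parameters, and the threshold rule itself are hypothetical and may differ from the criterion used in the paper.

```python
import random
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def pair_counts(contexts, entities):
    """Count, for each unordered entity pair, the contexts containing both."""
    counts = Counter()
    for tokens in contexts:
        present = sorted(set(tokens) & entities)
        counts.update(combinations(present, 2))
    return counts


def shuffled_contexts(contexts, rng):
    """Shuffle all tokens across the text, keeping each context's length."""
    flat = [tok for tokens in contexts for tok in tokens]
    rng.shuffle(flat)
    out, start = [], 0
    for tokens in contexts:
        out.append(flat[start:start + len(tokens)])
        start += len(tokens)
    return out


def cooccurrence_network(contexts, entities, n_shuffles=200, z_min=2.0, seed=0):
    """Keep pairs that co-occur more often than in shuffled versions of the text.

    Assumed rule (not taken from the paper): observed > mean + z_min * std.
    """
    rng = random.Random(seed)
    observed = pair_counts(contexts, entities)
    null_samples = {pair: [] for pair in observed}
    for _ in range(n_shuffles):
        null = pair_counts(shuffled_contexts(contexts, rng), entities)
        for pair in observed:
            null_samples[pair].append(null[pair])  # Counter gives 0 if absent
    edges = set()
    for pair, count in observed.items():
        mu = mean(null_samples[pair])
        sigma = pstdev(null_samples[pair])
        if count > mu + z_min * sigma:
            edges.add(pair)
    return edges


# Hypothetical toy text: "A" and "B" always share a context, "w" is filler.
toy_contexts = [["A", "B", "w", "w"]] * 10 + [["w", "w", "w", "w"]] * 40
print(cooccurrence_network(toy_contexts, {"A", "B"}))
```

Shuffling tokens while preserving context lengths keeps each entity's frequency fixed, so any pair that survives the threshold co-occurs more often than its frequency alone would explain.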
