Concept Identification of Directly and Indirectly Related Mentions Referring to Groups of Persons
Anastasia Zhukova, Felix Hamborg, Karsten Donnay, Bela Gipp

TL;DR
This paper presents an unsupervised clustering method to identify groups of persons acting as non-named entity actors in texts, effectively grouping related mentions with diverse wording while maintaining geopolitical distinctions.
Contribution
It presents first results of an unsupervised approach for clustering mentions of groups of persons that act as actors in a set of related articles, grouping semantically related mentions more accurately than a baseline.
Findings
Successfully clusters related mentions with diverse wording.
Maintains separation of geopolitical entities.
Outperforms the baseline in grouping related actor mentions.
Abstract
Unsupervised concept identification through clustering, i.e., the identification of semantically related words and phrases, is a common approach to identifying contextual primitives used in various tasks, e.g., text dimension reduction (replacing words with concepts to reduce the vocabulary size), summarization, and named entity resolution. We demonstrate the first results of an unsupervised approach for identifying groups of persons as actors extracted from a set of related articles. Specifically, the approach clusters mentions of groups of persons that act as non-named entity actors in the texts, e.g., "migrant families" = "asylum-seekers." Compared to our baseline, the approach keeps the mentions of the geopolitical entities separated, e.g., "Iran leaders" != "European leaders," and clusters (in)directly related mentions with diverse wording, e.g., "American…
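The text dimension reduction use case named in the abstract (replacing words with concepts to shrink the vocabulary) can be illustrated with a minimal sketch. It does not reproduce the paper's clustering method: the mention-to-concept mapping `CONCEPTS`, the concept labels, the sample sentence, and the helper names `replace_mentions` and `vocabulary` are all hypothetical, chosen only to show what happens once mentions have been assigned to concepts.

```python
import re

# Hypothetical mention-to-concept mapping, written by hand for illustration.
# In practice it would be the output of a clustering step, not shown here.
CONCEPTS = {
    "migrant families": "CONCEPT_MIGRANTS",
    "asylum-seekers": "CONCEPT_MIGRANTS",
    "iran leaders": "CONCEPT_IRAN_LEADERS",
    "european leaders": "CONCEPT_EUROPEAN_LEADERS",
}


def replace_mentions(text, concepts):
    """Replace each known mention with its concept label (case-insensitive)."""
    # Try longer mentions first so a long phrase wins over a shorter overlap.
    mentions = sorted(concepts, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(m) for m in mentions), re.IGNORECASE)
    return pattern.sub(lambda m: concepts[m.group(0).lower()], text)


def vocabulary(text):
    """Return the set of lowercased tokens with surrounding punctuation stripped."""
    tokens = {tok.strip(".,;:!?\"'").lower() for tok in text.split()}
    return tokens - {""}


text = "Migrant families waited at the border. Asylum-seekers met European leaders."
reduced = replace_mentions(text, CONCEPTS)

print(reduced)
print(len(vocabulary(text)), "->", len(vocabulary(reduced)))
```

Both "Migrant families" and "Asylum-seekers" collapse to the same label, while "European leaders" keeps its own label distinct from "Iran leaders", mirroring the separation of geopolitical entities described in the abstract.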
