Novel Aficionados and Doppelgängers: a referential task for semantic representations of individual entities

Andrea Bruera; Aurélie Herbelot

arXiv:2104.10270·cs.CL·April 22, 2021·1 cites

Open Access

TL;DR

This paper investigates why proper names are harder to learn and retrieve than common nouns by analyzing their linguistic distributions with a new referential task (the Doppelgänger test), a new dataset (Novel Aficionados), and an extensive set of models. It finds that distributional representations of individual entities are less clearly distinguishable from each other than those of common nouns.

Contribution

It introduces the Doppelgänger test and the Novel Aficionados dataset to analyze the semantic distinctions of proper names versus common nouns in distributional models.

Findings

01. Distributional representations of individual entities are less distinguishable than those of common nouns.

02. The linguistic distribution of proper names reflects their cognitive difficulty in learning and retrieval.

03. The results mirror human semantic cognition patterns.

Abstract

In human semantic cognition, proper names (names which refer to individual entities) are harder to learn and retrieve than common nouns. This seems to be the case for machine learning algorithms too, but the linguistic and distributional reasons for this behaviour have not been investigated in depth so far. To tackle this issue, we show that the semantic distinction between proper names and common nouns is reflected in their linguistic distributions by employing an original task for distributional semantics, the Doppelgänger test, an extensive set of models, and a new dataset, the Novel Aficionados dataset. The results indicate that the distributional representations of different individual entities are less clearly distinguishable from each other than those of common nouns, an outcome which intriguingly mirrors human cognition.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Language and cultural evolution