Autoregressive Entity Retrieval

Nicola De Cao; Gautier Izacard; Sebastian Riedel; Fabio Petroni

arXiv:2010.00904·cs.CL·March 25, 2021·200 cites

TL;DR

GENRE retrieves entities by generating their names token-by-token with an autoregressive model. This shrinks the memory footprint relative to dense-representation retrievers, captures interactions between context and entity name more directly, and yields state-of-the-art or competitive results on more than 20 datasets.

Contribution

This work presents GENRE, the first autoregressive entity retrieval system that directly generates entity names, addressing key limitations of existing dense representation methods.

Findings

01 Achieves state-of-the-art or competitive results on 20+ datasets.

02 Significantly reduces memory footprint compared to existing methods.

03 Allows easy addition of new entities by specifying their names.
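
As a rough illustration of token-by-token name generation and of the third finding, here is a minimal Python sketch that restricts generation to a fixed set of entity names using a prefix trie. It is a toy under stated assumptions, not GENRE's implementation: names are split on whitespace rather than subword tokens, decoding is greedy, and `make_overlap_scorer` is a hypothetical stand-in for a trained sequence-to-sequence model's next-token score. All function names are invented for this example.

```python
from typing import Callable, Dict, List

END = "<eos>"  # hypothetical end-of-name marker


def add_entity(trie: Dict, name: str) -> None:
    """Register an entity by inserting its name tokens into the trie."""
    node = trie
    for tok in name.split() + [END]:
        node = node.setdefault(tok, {})


def build_trie(names: List[str]) -> Dict:
    """Build a prefix trie over whitespace-tokenized entity names."""
    trie: Dict = {}
    for name in names:
        add_entity(trie, name)
    return trie


def allowed_next(trie: Dict, prefix: List[str]) -> List[str]:
    """Return the tokens that keep `prefix` a prefix of some entity name."""
    node = trie
    for tok in prefix:
        node = node.get(tok)
        if node is None:
            return []
    return sorted(node)


def make_overlap_scorer(context: str) -> Callable[[List[str], str], float]:
    """Toy scorer (assumption): 1.0 if the token occurs in the context."""
    words = set(context.lower().split())
    return lambda prefix, tok: 1.0 if tok.lower() in words else 0.0


def greedy_generate(trie: Dict, score: Callable[[List[str], str], float]) -> str:
    """Generate one entity name, choosing only among trie-allowed tokens."""
    prefix: List[str] = []
    while True:
        options = allowed_next(trie, prefix)
        if not options:  # empty trie: nothing to generate
            return " ".join(prefix)
        best = max(options, key=lambda tok: score(prefix, tok))
        if best == END:
            return " ".join(prefix)
        prefix.append(best)


trie = build_trie(["Leonardo da Vinci", "Leonardo DiCaprio", "Titanic (1997 film)"])
first = greedy_generate(trie, make_overlap_scorer("Leonardo DiCaprio won an Oscar"))

# Adding an entity only requires its name; no per-entity vector is stored.
add_entity(trie, "Leonardo Bonacci")
second = greedy_generate(trie, make_overlap_scorer("Leonardo Bonacci wrote Liber Abaci"))
```

In this sketch the set of retrievable entities lives entirely in the trie, so extending it is a single `add_entity` call, while the scorer (in a real system, the model) is left unchanged.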

Abstract

Entities are at the center of how we represent and aggregate knowledge. For instance, encyclopedias such as Wikipedia are structured by entities (e.g., one per Wikipedia article). The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering. Current approaches can be understood as classifiers among atomic labels, one for each entity. Their weight vectors are dense entity representations produced by encoding entity meta information such as their descriptions. This approach has several shortcomings: (i) the affinity between context and entity is mainly captured through a vector dot product, potentially missing fine-grained interactions; (ii) a large memory footprint is needed to store dense representations when considering large entity sets; (iii) an appropriately hard set of negative data has to be…

Code & Models

Videos

Autoregressive Entity Retrieval · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies

Methods: Softmax