Autoregressive Entity Retrieval
Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni

TL;DR
GENRE retrieves entities by generating their names token by token with an autoregressive model. This reduces memory usage, captures context-entity interactions more directly than a single dot product, and yields state-of-the-art or competitive results across multiple datasets.
Contribution
This work presents GENRE, the first autoregressive entity retrieval system that directly generates entity names, addressing key limitations of existing dense representation methods.
Findings
Achieves state-of-the-art or competitive results on 20+ datasets.
Reduces the memory footprint relative to dense-representation methods, because model parameters scale with vocabulary size rather than entity count.
Allows easy addition of new entities by specifying their names.
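The last finding can be illustrated with a small sketch. Restricting generation to valid entity names is commonly done with a prefix tree (trie) over tokenized names, so that at each step only tokens continuing some known name are allowed. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the `NameTrie` class, the `<eos>` marker, and whitespace tokenization are all hypothetical stand-ins (a real system would use the model's subword vocabulary).

```python
from typing import Dict, Iterable, List

END = "<eos>"  # hypothetical end-of-name marker


class NameTrie:
    """Prefix tree over tokenized entity names (toy whitespace tokens)."""

    def __init__(self, names: Iterable[str] = ()) -> None:
        self.root: Dict[str, dict] = {}
        for name in names:
            self.add(name)

    def add(self, name: str) -> None:
        # Registering a new entity only requires inserting its name.
        node = self.root
        for token in name.split() + [END]:
            node = node.setdefault(token, {})

    def allowed_next(self, prefix: List[str]) -> List[str]:
        # Tokens that keep the partial output a prefix of some valid name.
        node = self.root
        for token in prefix:
            if token not in node:
                return []  # prefix matches no known entity name
            node = node[token]
        return sorted(node)


trie = NameTrie(["New York City", "New York Times", "Newcastle"])
print(trie.allowed_next(["New", "York"]))  # ['City', 'Times']

# Adding an entity is a single insertion; no embeddings are recomputed.
trie.add("New York Knicks")
print(trie.allowed_next(["New", "York"]))  # ['City', 'Knicks', 'Times']
```

In this sketch a decoder would consult `allowed_next` at every generation step and mask out all other tokens, so every completed output is guaranteed to be a name in the trie.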
Abstract
Entities are at the center of how we represent and aggregate knowledge. For instance, encyclopedias such as Wikipedia are structured by entities (e.g., one per Wikipedia article). The ability to retrieve such entities given a query is fundamental for knowledge-intensive tasks such as entity linking and open-domain question answering. Current approaches can be understood as classifiers among atomic labels, one for each entity. Their weight vectors are dense entity representations produced by encoding entity meta information such as their descriptions. This approach has several shortcomings: (i) context and entity affinity is mainly captured through a vector dot product, potentially missing fine-grained interactions; (ii) a large memory footprint is needed to store dense representations when considering large entity sets; (iii) an appropriately hard set of negative data has to be…
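Shortcomings (i) and (ii) can be made concrete with a toy version of the dense-classifier setup the abstract describes: one vector per entity, scored against the context by a dot product. This is only a sketch; the function names, the tiny 2-dimensional vectors, and the entity count and dimension used in the memory estimate are hypothetical values chosen for illustration, not figures from the paper.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def retrieve(context_vec: Sequence[float],
             entity_matrix: List[Sequence[float]]) -> int:
    # Each entity is an atomic label with its own dense weight vector;
    # affinity with the context is a single dot product per entity.
    scores = [dot(context_vec, row) for row in entity_matrix]
    return max(range(len(scores)), key=scores.__getitem__)


def index_bytes(num_entities: int, dim: int, bytes_per_float: int = 4) -> int:
    # Storage for the entity vectors grows linearly with the entity count.
    return num_entities * dim * bytes_per_float


# Toy entity vectors (hypothetical values).
entities = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
best = retrieve([0.9, 0.8], entities)
print(best)  # 2

# Hypothetical sizes: one million entities, 1024-dimensional float32 vectors.
print(index_bytes(1_000_000, 1024))  # 4096000000
```

The memory term depends on `num_entities`, which is the scaling behavior that an approach generating names from a fixed vocabulary avoids.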
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Softmax
