Show, Write, and Retrieve: Entity-aware Article Generation and Retrieval
Zhongping Zhang, Yiwen Gu, Bryan A. Plummer

TL;DR
The paper introduces ENGINE, a framework that explicitly incorporates named entities into language models to improve article generation and retrieval, especially for news stories referencing real-world entities.
Contribution
ENGINE is the first framework to explicitly integrate named entities into language models for improved article generation and retrieval tasks.
Findings
Improved article generation perplexity by 4-5 points (a reduction, since lower perplexity is better).
Improved article retrieval recall@1 by 3-4%.
Effective on three public datasets: GoodNews, VisualNews, WikiText.
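
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how perplexity and recall@k are commonly computed. It is not the authors' evaluation code: the function names, the natural-log convention, and the assumption that the correct candidate for query i sits at index i are all illustrative choices made here.

```python
import math


def perplexity(token_log_probs):
    """Perplexity as exp of the negative mean log-probability per token.

    Assumes natural-log probabilities, one per generated token (illustrative convention).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def recall_at_k(similarity, k=1):
    """Fraction of queries whose correct candidate ranks in the top k.

    similarity[i][j] is the score between query i and candidate j.
    Assumption for this sketch: the correct candidate for query i is candidate i.
    """
    if not similarity:
        raise ValueError("need at least one query")
    hits = 0
    for i, row in enumerate(similarity):
        # Rank candidate indices by descending score and keep the first k.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += i in top_k
    return hits / len(similarity)
```

Under these definitions a lower perplexity means the model assigns higher probability to the reference text, while a higher recall@1 means the correct article is more often ranked first.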
Abstract
Article comprehension is an important challenge in natural language processing with many applications such as article generation or image-to-article retrieval. Prior work typically encodes all tokens in articles uniformly using pretrained language models. However, in many applications, such as understanding news stories, these articles are based on real-world events and may reference many named entities that are difficult to accurately recognize and predict by language models. To address this challenge, we propose an ENtity-aware article GeneratIoN and rEtrieval (ENGINE) framework, to explicitly incorporate named entities into language models. ENGINE has two main components: a named-entity extraction module to extract named entities from both metadata and embedded images associated with articles, and an entity-aware mechanism that enhances the model's ability to recognize and predict…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Balanced Selection
