Instilling Type Knowledge in Language Models via Multi-Task QA

Shuyang Li; Mukund Sridhar; Chandana Satya Prakash; Jin Cao; Wael Hamza; Julian McAuley

arXiv:2204.13796·cs.CL·May 2, 2022

TL;DR

This paper presents a novel multi-task question-answering approach to embed fine-grained entity type knowledge into language models using a large-scale dataset derived from Wikipedia and Wikidata, improving zero-shot entity understanding.

Contribution

The authors introduce the WikiWiki dataset and a training method that enhances language models' ability to recognize and infer entity types in a fine-grained manner.

Findings

01. State-of-the-art zero-shot dialog state tracking performance

02. Accurate inference of entity types in Wikipedia articles

03. Discovery of new useful entity types by models

Abstract

Understanding human language often necessitates understanding entities and their place in a taxonomy of knowledge -- their types. Previous methods to learn entity types rely on training classifiers on datasets with coarse, noisy, and incomplete labels. We introduce a method to instill fine-grained type knowledge in language models with text-to-text pre-training on type-centric questions leveraging knowledge base documents and knowledge graphs. We create the WikiWiki dataset: entities and passages from 10M Wikipedia articles linked to the Wikidata knowledge graph with 41K types. Models trained on WikiWiki achieve state-of-the-art performance in zero-shot dialog state tracking benchmarks, accurately infer entity types in Wikipedia articles, and can discover new types deemed useful by human judges.
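
To make the idea of "type-centric questions" in a text-to-text format concrete, here is a minimal sketch of how one entity record might be turned into (input, target) string pairs. Everything in it is an assumption for illustration: the `Example` class, the `build_examples` helper, the question wording, and the toy record are hypothetical and are not taken from the paper or the WikiWiki dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    source: str  # text fed to the model
    target: str  # text the model is trained to generate


def build_examples(entity, passage, gold_types, negative_type):
    """Turn one (entity, passage, types) record into text-to-text QA pairs.

    The question templates below are hypothetical, chosen only to show
    the general shape of a type-centric multi-task setup.
    """
    context = f"context: {passage}"
    ordered = sorted(gold_types)  # sort for deterministic output

    # Task 1 (assumed): generate every type of the entity.
    examples = [
        Example(
            f"question: what are the types of {entity}? {context}",
            ", ".join(ordered),
        )
    ]
    # Task 2 (assumed): verify a single candidate type, positive cases.
    for type_name in ordered:
        examples.append(
            Example(f"question: is {entity} a {type_name}? {context}", "yes")
        )
    # Task 2, negative case: a type the entity does not have.
    examples.append(
        Example(f"question: is {entity} a {negative_type}? {context}", "no")
    )
    return examples


# Toy record, invented for this sketch.
examples = build_examples(
    entity="Ada Lovelace",
    passage="Ada Lovelace was an English mathematician and writer.",
    gold_types={"mathematician", "writer"},
    negative_type="river",
)

for ex in examples:
    print(ex.source, "->", ex.target)
```

In a setup like this, the types would come from the knowledge graph and the passages from the linked articles, so the same record can feed several question formats during pre-training.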

Code & Models

Repositories

amazon-research/wikiwiki-dataset (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Wikis in Education and Collaboration

Methods: Balanced Selection