Commonsense Knowledge Mining from Pretrained Models
Joshua Feldman, Joe Davison, Alexander M. Rush

TL;DR
This paper introduces an unsupervised method for mining commonsense knowledge from pre-trained language models by transforming triples into masked sentences, which outperforms supervised methods on novel data.
Contribution
It proposes a novel approach that leverages pre-trained models without fine-tuning, improving generalization in commonsense knowledge extraction from new sources.
Findings
Outperforms supervised models on unseen data
Does not require fine-tuning of the language model
Generalizes better to new sources of commonsense knowledge
Abstract
Inferring commonsense knowledge is a key challenge in natural language processing, but due to the sparsity of training data, previous work has shown that supervised methods for commonsense knowledge mining underperform when evaluated on novel data. In this work, we develop a method for generating commonsense knowledge using a large, pre-trained bidirectional language model. By transforming relational triples into masked sentences, we can use this model to rank a triple's validity by the estimated pointwise mutual information between the two entities. Since we do not update the weights of the bidirectional model, our approach is not biased by the coverage of any one commonsense knowledge base. Though this method performs worse on a test set than models explicitly trained on a corresponding training set, it outperforms these methods when mining commonsense knowledge from new sources,…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
