TL;DR
This paper introduces a graph convolution-based probe model to interpret knowledge integration in language models, revealing limited factual knowledge incorporation and differences across models, suggesting the need for fundamental advances.
Contribution
It proposes the Graph Convolution Simulator (GCS) for interpreting knowledge integration in language models and analyzes existing models to reveal their knowledge limitations.
Findings
GCS effectively interprets the knowledge integration process.
ERNIE and K-Adapter incorporate limited factual knowledge.
Increasing KI corpus size alone is insufficient for better knowledge integration.
Abstract
Pretrained language models (LMs) do not capture factual knowledge very well. This has led to the development of a number of knowledge integration (KI) methods, which aim to incorporate external knowledge into pretrained LMs. Even though KI methods show some performance gains over vanilla LMs, the inner workings of these methods are not well understood. For instance, it is unclear how and what kind of knowledge is effectively integrated into these models, and whether such integration may lead to catastrophic forgetting of already learned knowledge. This paper revisits the KI process in these models from an information-theoretic view and shows that KI can be interpreted using a graph convolution operation. We propose a probe model called Graph Convolution Simulator (GCS) for interpreting knowledge-enhanced LMs and exposing what kind of knowledge is integrated into these models. We…
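For readers unfamiliar with the operation the abstract refers to, the following is a minimal sketch of a generic graph convolution step (symmetric-normalized neighbor aggregation with self-loops). It is not the paper's GCS probe. The function name, the toy three-entity graph, and the identity feature and weight matrices are all assumptions chosen for illustration.

```python
import numpy as np

def graph_convolution(adjacency, features, weight):
    """One generic graph convolution step: normalize(A + I) @ H @ W.

    Illustrative only; this is not the GCS probe described in the paper.
    """
    # Add self-loops so each node keeps part of its own representation.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    # Symmetric degree normalization: D^{-1/2} (A + I) D^{-1/2}.
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    # Aggregate neighbor features, then apply the linear transform.
    return a_norm @ features @ weight

# Hypothetical toy knowledge graph with three entities: 0 -- 1 -- 2.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
features = np.eye(3)  # one-hot stand-ins for entity representations
weight = np.eye(3)    # identity weight keeps the aggregation visible
updated = graph_convolution(adjacency, features, weight)
print(updated)
```

With identity features and weights, each row of `updated` shows how much of each neighbor's representation flows into an entity, which is the intuition behind reading knowledge integration as aggregation over a knowledge graph.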
Taxonomy
Methods: ERNIE · Convolution
