Knowledge Rumination for Pre-trained Language Models

Yunzhi Yao; Peng Wang; Shengyu Mao; Chuanqi Tan; Fei Huang; Huajun Chen; Ningyu Zhang

arXiv:2305.08732·cs.CL·October 12, 2023·2 cites

TL;DR

This paper introduces Knowledge Rumination, a simple prompt-based method that helps pre-trained language models better utilize their internal knowledge for knowledge-intensive NLP tasks, improving performance without external retrieval.

Contribution

It proposes a novel prompt-based paradigm enabling PLMs to review and consolidate their latent knowledge, enhancing their effectiveness on various NLP benchmarks.

Findings

1. Improved performance on six commonsense reasoning tasks
2. Enhanced results on GLUE benchmarks
3. Effective across multiple language models

Abstract

Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs. However, despite the promising outcome, we empirically observe that PLMs may have already encoded rich knowledge in their pre-trained parameters but fail to fully utilize them when applying them to knowledge-intensive tasks. In this paper, we propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize that related latent knowledge without retrieving it from the external corpus. By simply adding a prompt like "As far as I know" to the PLMs, we try to review related latent knowledge and inject them back into the model for knowledge consolidation. We apply the proposed knowledge rumination to various language models, including…
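The prompting idea in the abstract can be illustrated with a small text-level sketch. It only shows the two-pass flow (elicit latent knowledge with the "As far as I know" prompt, then feed that knowledge back alongside the input); it is not the authors' implementation, and how the paper actually injects the reviewed knowledge into the model is not specified in the text above. The names `build_rumination_input`, `ruminate_then_answer`, `generate`, and `toy_model` are hypothetical, and `toy_model` is a stand-in that exists only to make the example runnable.

```python
from typing import Callable

RUMINATION_PROMPT = "As far as I know"  # prompt wording quoted in the abstract


def build_rumination_input(text: str, prompt: str = RUMINATION_PROMPT) -> str:
    """Append the rumination prompt so the model continues with what it 'knows'."""
    return f"{text.strip()} {prompt}"


def ruminate_then_answer(text: str, generate: Callable[[str], str]) -> str:
    """Two-pass sketch: elicit latent knowledge, then feed it back with the input.

    `generate` is a hypothetical stand-in for any text-in/text-out model call.
    """
    # Pass 1: ask the model to review related knowledge.
    knowledge = generate(build_rumination_input(text)).strip()
    # Pass 2: give the model the input together with the reviewed knowledge.
    consolidated = f"{text.strip()} {RUMINATION_PROMPT} {knowledge}"
    return generate(consolidated)


def toy_model(prompt: str) -> str:
    """Toy stand-in for a PLM (assumption, not a real model)."""
    if prompt.endswith(RUMINATION_PROMPT):
        return "birds have wings."
    return "yes" if "wings" in prompt else "unknown"


if __name__ == "__main__":
    question = "Can a sparrow fly?"
    print(toy_model(question))                        # answer without rumination
    print(ruminate_then_answer(question, toy_model))  # answer with rumination
```

In this toy setup the second pass sees the model's own elicited statement as extra context, which is the intuition behind consolidating latent knowledge without retrieving anything from an external corpus.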

Code & Models

Repositories

zjunlp/knowledge-rumination · jax · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · WordPiece · Linear Warmup With Linear Decay · Byte Pair Encoding · Dropout · Linear Layer