BERTese: Learning to Speak to BERT
Adi Haviv, Jonathan Berant, Amir Globerson

TL;DR
This paper introduces BERTese, a method that automatically rewrites queries into paraphrases optimized directly for knowledge extraction from large pre-trained language models. It outperforms competing baselines without requiring a separate paraphrasing pipeline.
Contribution
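The paper proposes BERTese, an automatic query rewriting technique that is trained directly for knowledge extraction and uses auxiliary losses to keep the rewritten query close to actual language tokens, removing the need for a separate paraphrase-gathering pipeline.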
Findings
BERTese outperforms competing baselines on knowledge extraction.
The method removes the need for the complex pipelines that earlier work used to gather paraphrases of manually authored queries.
The rewritten queries give some insight into the kind of language that helps language models perform knowledge extraction.
Abstract
Large pre-trained language models have been shown to encode large amounts of world and commonsense knowledge in their parameters, leading to substantial interest in methods for extracting that knowledge. In past work, knowledge was extracted by taking manually-authored queries and gathering paraphrases for them using a separate pipeline. In this work, we propose a method for automatically rewriting queries into "BERTese", a paraphrase query that is directly optimized towards better knowledge extraction. To encourage meaningful rewrites, we add auxiliary loss functions that encourage the query to correspond to actual language tokens. We empirically show our approach outperforms competing baselines, obviating the need for complex pipelines. Moreover, BERTese provides some insight into the type of language that helps language models perform knowledge extraction.
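The abstract does not spell out the auxiliary losses, so the following is only a minimal sketch of one plausible form: penalize each rewritten query vector by its squared distance to the nearest vocabulary embedding, so that the rewrite can be decoded back into real tokens. The toy vocabulary, its 2-dimensional embeddings, and the names `token_validity_loss` and `decode` are hypothetical and are not taken from the paper.

```python
# Toy vocabulary embeddings (hypothetical values, 2-dimensional for readability).
VOCAB = {
    "obama": [1.0, 0.0],
    "born": [0.0, 1.0],
    "in": [-1.0, 0.0],
    "[MASK]": [0.0, -1.0],
}


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_token(vector, vocab=VOCAB):
    """Return (token, squared distance) for the vocabulary embedding closest to vector."""
    candidates = ((tok, squared_distance(vector, emb)) for tok, emb in vocab.items())
    return min(candidates, key=lambda pair: pair[1])


def token_validity_loss(rewritten, vocab=VOCAB):
    """Assumed auxiliary loss: mean squared distance of each rewritten
    vector to its nearest token embedding (zero when every vector is a real token)."""
    return sum(nearest_token(vec, vocab)[1] for vec in rewritten) / len(rewritten)


def decode(rewritten, vocab=VOCAB):
    """Map each rewritten vector to its nearest vocabulary token."""
    return [nearest_token(vec, vocab)[0] for vec in rewritten]


# A rewritten query whose vectors sit close to, but not exactly on, real tokens.
rewritten_query = [[0.9, 0.1], [0.0, 1.0]]
print(decode(rewritten_query))               # → ['obama', 'born']
print(token_validity_loss(rewritten_query))  # small positive value
```

In a real setup this term would be added to the knowledge extraction loss and computed over the language model's full embedding matrix. The pure-Python loops here only keep the sketch self-contained.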
