Retrieval is Accurate Generation

Bowen Cao; Deng Cai; Leyang Cui; Xuxin Cheng; Wei Bi; Yuexian Zou; Shuming Shi

arXiv:2402.17532 · cs.CL · March 19, 2024 · 1 citation

TL;DR

This paper introduces a retrieval-based text generation method that selects context-aware phrases from documents, significantly improving accuracy and quality over standard language models in knowledge-intensive and open-ended tasks.

Contribution

The paper proposes a retrieval-based generation approach that selects phrases from supporting documents instead of tokens from a fixed vocabulary. Its training oracles are initialized with linguistic heuristics and then refined through iterative self-reinforcement.
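To make the contrast with token-based models concrete, the following is a minimal sketch of one decoding step in which the candidate set is the union of a small token vocabulary and a set of phrases drawn from supporting documents. Everything in it is an assumption for illustration: the names `next_unit`, `token_table`, and `phrase_index`, the two-dimensional vectors, and the use of a plain dot product followed by a softmax. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def next_unit(prefix_vec, token_table, phrase_index):
    """Pick the next unit from tokens and document phrases together.

    token_table: hypothetical mapping token -> embedding (the fixed vocabulary).
    phrase_index: hypothetical list of {"text", "doc", "vec"} entries, one per
    phrase occurrence in a supporting document.
    """
    candidates = [
        {"kind": "token", "text": tok, "doc": None, "vec": vec}
        for tok, vec in token_table.items()
    ]
    candidates += [{"kind": "phrase", **entry} for entry in phrase_index]

    # Score every candidate against the prefix and normalize jointly.
    probs = softmax([dot(prefix_vec, c["vec"]) for c in candidates])
    best = max(range(len(candidates)), key=probs.__getitem__)
    return candidates[best], probs[best]


# Toy data, invented for this sketch.
token_table = {"the": [0.1, 0.0], "moon": [0.2, 0.3]}
phrase_index = [
    {"text": "the far side of the moon", "doc": "doc-1", "vec": [0.9, 0.8]},
]
prefix_vec = [1.0, 1.0]

chosen, prob = next_unit(prefix_vec, token_table, phrase_index)
print(chosen["kind"], chosen["text"], chosen["doc"], round(prob, 3))
```

Because each phrase candidate carries the document it came from, a selected phrase can be attributed to its source, and swapping `phrase_index` for another one changes what can be generated without touching the scoring code.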

Findings

01

Outperforms standard language models on OpenbookQA, reaching 36.27% accuracy

02

Achieves a higher MAUVE score (81.58%) in open-ended generation

03

Attains the best performance and the lowest latency among retrieval baselines

Abstract

Standard language models generate text by selecting tokens from a fixed, finite, and standalone vocabulary. We introduce a novel method that selects context-aware phrases from a collection of supporting documents. One of the most significant challenges for this paradigm shift is determining the training oracles, because a string of text can be segmented in various ways and each segment can be retrieved from numerous possible documents. To address this, we propose to initialize the training oracles using linguistic heuristics and, more importantly, bootstrap the oracles through iterative self-reinforcement. Extensive experiments show that our model not only outperforms standard language models on a variety of knowledge-intensive tasks but also demonstrates improved generation quality in open-ended text generation. For instance, compared to the standard language model counterpart, our…
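The oracle problem described in the abstract can be illustrated with a small sketch: it enumerates every way to split a short token sequence into units, where a unit is either a single token or a multi-token phrase that occurs verbatim in at least one supporting document. The function `segmentations`, the `max_len` limit, and the two toy documents are assumptions made for this example, and whitespace tokenization with exact matching stands in for whatever matching the paper actually uses.

```python
import math


def segmentations(tokens, docs, max_len=4):
    """Yield every split of `tokens` into units.

    Each unit is (text, sources): sources is None for a plain token, or a
    tuple of document names containing the phrase verbatim.
    """

    def sources(span):
        text = " ".join(span)
        # Pad with spaces so matches respect word boundaries.
        return tuple(
            name for name, body in docs.items() if f" {text} " in f" {body} "
        )

    def rec(i):
        if i == len(tokens):
            yield []
            return
        for j in range(i + 1, min(len(tokens), i + max_len) + 1):
            span = tokens[i:j]
            if j - i == 1:
                unit = (span[0], None)  # always allowed: fall back to a token
            else:
                found = sources(span)
                if not found:
                    continue  # phrase not retrievable from any document
                unit = (" ".join(span), found)
            for rest in rec(j):
                yield [unit] + rest

    yield from rec(0)


# Toy inputs, invented for this sketch.
tokens = ["the", "moon", "orbits", "earth"]
docs = {
    "doc-a": "the moon orbits earth every month",
    "doc-b": "tides follow the moon",
}

all_segs = list(segmentations(tokens, docs))

# A training oracle fixes both the segmentation and, for each phrase, one
# source document, so the number of possible oracles is larger still.
num_oracles = sum(
    math.prod(len(srcs) for _, srcs in seg if srcs is not None)
    for seg in all_segs
)
print(len(all_segs), num_oracles)
```

Even for four tokens and two documents there are several candidate segmentations, and more candidate oracles once the source of each phrase is counted, which is why the choice has to be initialized by some heuristic and then refined.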

Peer Reviews

Decision · ICLR 2024 poster

Reviewer 01 · Rating 6 · marginally above the acceptance threshold · Confidence 3

Strengths

The proposed approach seems very interesting and will be useful to the generation community. As I mentioned earlier, the linguistically inspired approach could be very useful in providing meaningful attributions to their sources. Strong results on a variety of benchmarks, from open-book QA to open-ended generation tasks.

Most of my concerns were adequately addressed in the authors' rebuttal. Please include these details in the camera-ready version if accepted. I have updated my reviews acco…

Weaknesses

The authors proposed a very interesting approach, but I felt a lot of important details are missing. Please see my questions/comments below. It is unclear whether the code from this work will be released. Another weakness, I believe, is that this approach will not be robust for languages or domains where our syntactic parsing capabilities are limited.

Reviewer 02 · Rating 8 · accept, good paper · Confidence 4

Strengths

- The paper is well written and easy to follow.
- Their approach to text generation by selecting phrases is novel and interesting.
- The authors study the effectiveness of their approach well and provide comparisons with other approaches.
- Their zero-shot results on knowledge-intensive tasks are convincing evidence of the effectiveness of their approach.

Weaknesses

- Lack of any human evaluations: although there are automatic metrics for text generation, there is still a need to have humans judge the generations.
- The paper does not provide deep insights into the observed results. For example, in Section 6, Main Results, related to Table 4, it is not clear why the MAUVE score has such a huge jump for their method, or why finetuning the base model drops this score by a lot.

Reviewer 03 · Rating 8 · accept, good paper · Confidence 4

Strengths

- A novel approach for retrieval-augmented generation.
- Holistic evaluation, not only by measuring the fluency in open-ended text generation but also by carrying out comprehensive evaluation in a wide range of knowledge-intensive tasks, such as open-domain question answering.
- Plug-and-play feature of the phrase index, as a way of adapting to out-of-domain distributions (such as the Medical domain) by simply changing/extending the phrase index with a domain-specific index without any further tr

Weaknesses

- More implementation details regarding the size of the phrase index, etc. would be good to have in the paper.
- The work might also benefit from some discussion regarding scalability of the phrase index.

Minor suggestions:
- As Figure 1 is the main overview of the approach proposed in the paper, a more detailed footnote would be appreciated.
- Section 6 "Results" would be better under subsection 5.2.2, as it reflects on results from the Open-Ended Text Generation experiments.
- T

Code & Models

Repositories

gmftby/copyisallyouneed · PyTorch · Official

Taxonomy

Topics: AI-based Problem Solving and Planning · Speech and dialogue systems · Machine Learning and Algorithms