ACER: Automatic Language Model Context Extension via Retrieval

Luyu Gao; Yunyi Zhang; Jamie Callan

arXiv:2410.09141·cs.CL·October 15, 2024

Open Access · 3 Reviews

TL;DR

This paper introduces ACER, a method that improves a language model's long-context understanding by synthesizing training data with retrieval and then self-tuning a short-context model on it. The resulting model outperforms generalist long-context models on long-context tasks.

Contribution

ACER proposes an automatic data synthesis pipeline inspired by how humans process large bodies of information: retrieve first, then read only the top candidates closely. The pipeline gives language models task-specific long-context capabilities without extensive task-specific data.

Findings

1. ACER outperforms generalist long-context models in retrieval tasks.
2. Synthetic data improves long-context reasoning abilities.
3. Self-tuning of short-context models enhances performance in complex tasks.

Abstract

Long-context modeling is one of the critical capabilities of language AI for digesting and reasoning over complex information pieces. In practice, long-context capabilities are typically built into a pre-trained language model (LM) through a carefully designed context extension stage, with the goal of producing generalist long-context capabilities. In our preliminary experiments, however, we discovered that the current open-weight generalist long-context models are still lacking in practical long-context processing tasks. While this means perfectly effective long-context modeling demands task-specific data, the cost can be prohibitive. In this paper, we draw inspiration from how humans process a large body of information: a lossy retrieval stage ranks a large set of documents while the reader ends up reading deeply only the top candidates. We build an automatic data…
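The retrieve-then-read idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the lexical-overlap scorer stands in for whatever retriever the paper uses, and every name here (`score`, `retrieve_top_k`, `build_reader_context`) is hypothetical. It only shows the shape of the two stages: a cheap, lossy ranking over many documents, followed by a reader context built from the top candidates alone.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def score(query, doc):
    """Crude lexical-overlap score; an assumed stand-in for a real retriever."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(doc))
    overlap = sum(min(q[t], d[t]) for t in q)
    # Normalize by document length so long documents are not favored.
    return overlap / math.sqrt(max(len(tokenize(doc)), 1))


def retrieve_top_k(query, docs, k=2):
    """Lossy retrieval stage: rank every document, keep only the top k."""
    ranked = sorted(docs, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_reader_context(query, docs, k=2):
    """Reading stage input: only the top-k passages plus the question."""
    top = retrieve_top_k(query, docs, k)
    passages = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(top))
    return f"{passages}\n\nQuestion: {query}"


docs = [
    "The Eiffel Tower is located in Paris.",
    "Bananas are rich in potassium.",
    "Paris is the capital of France.",
]
query = "What is the capital of France?"
context = build_reader_context(query, docs, k=2)
print(context)
```

In this toy setup the reader would see two short passages instead of the whole collection; the unrelated document never reaches the reading stage.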

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 3

Strengths

- The topic of long-context modeling is both compelling and critical, and this paper provides valuable new insights into addressing this task.
- The proposed method is well-conceived and alleviates the need for extensive resources for human-annotated data.
- The approach demonstrates the potential for practical application, making it a meaningful contribution to long-context modeling research.

Weaknesses

- My primary concern with this paper is the limited evaluation. The experiments provide only a narrow comparison of long-context benchmarks, such as Infinibench [1] and LongBench [2]. Additionally, the paper omits several important approaches mentioned in [3], such as self-extend [4] and lm-infinite [5]; a comparative analysis or discussion of these methods would strengthen the evaluation.
- There is also a lack of case studies and in-depth analysis of the model’s long-co

Reviewer 02 · Rating 5 · Confidence 3

Strengths

1. The paper is well-written.
2. The methodology is clear and effective.

Weaknesses

1. The evaluation was conducted solely on long-context RAG tasks, where improvement is natural given the methodology. However, it was not assessed on more general long-context evaluation sets, such as LV-Eval and Needle in a Haystack.
2. The approach seems to be too simplistic and straightforward, lacking innovation and contribution.
3. Experiments are conducted on only one size and one type of language model.

Reviewer 03 · Rating 5 · Confidence 4

Strengths

An effective pipeline for automatically generating RAG training data, achieving significant performance improvements even with just an 8B model as a data generator. The scores for downstream RAG tasks are also impressive.

Weaknesses

1. Most comparisons in the experiments are with Long-Context Models. I believe additional RAG strategies should be included for comparison.
2. It would be helpful to see some comparative data statistics, such as a table showing Big Context length and CoT Answer length from Figure 1, as well as length comparisons in the experimental section.
3. Figure 1 needs a higher-resolution version. Using a PDF image is recommended.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Sparse Evolutionary Training