Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning

Xijie Huang; Li Lyna Zhang; Kwang-Ting Cheng; Fan Yang; Mao Yang

arXiv:2312.08901·cs.CL·February 16, 2024·2 cites

Open Access

TL;DR

CoT-Influx improves LLM mathematical reasoning by pruning Chain-of-Thought examples so that more effective, concise examples fit within the context window, yielding significant performance improvements without fine-tuning the LLM.

Contribution

Introduces a coarse-to-fine pruner, trained with reinforcement learning, that optimizes CoT prompts and enables more effective reasoning without fine-tuning the LLM itself.

Findings

01

Outperforms baseline prompting methods across multiple LLMs and datasets.

02

Fits more CoT examples into the prompt, covering double the context window size in tokens.

03

LLaMA2-70B with CoT-Influx surpasses GPT-3.5 and larger LLMs on GSM8K.

Abstract

Large Language Models (LLMs) have shown impressive capabilities, yet they still struggle with math reasoning. In this work, we propose CoT-Influx, a novel approach that pushes the boundary of few-shot Chain-of-Thoughts (CoT) learning to improve LLM mathematical reasoning. Motivated by the observation that adding more concise CoT examples in the prompt can improve LLM reasoning performance, CoT-Influx employs a coarse-to-fine pruner to maximize the input of effective and concise CoT examples. The pruner first selects as many crucial CoT examples as possible and then prunes unimportant tokens to fit the context window. A math reasoning dataset with diverse difficulty levels and reasoning steps is used to train the pruner, along with a math-specialized reinforcement learning approach. As a result, by enabling more CoT examples with double the context window size in tokens, CoT-Influx…
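The two-stage idea described in the abstract (select examples first, then prune tokens to fit the window) can be illustrated with a minimal sketch. This is not the authors' implementation: in CoT-Influx the pruner is learned with reinforcement learning, whereas here the importance scores are simply passed in. The function name `coarse_to_fine_prune` and the parameters `example_scores`, `token_scores`, `shot_budget`, and `token_budget` are hypothetical names introduced for illustration only.

```python
from typing import List


def coarse_to_fine_prune(
    examples: List[List[str]],        # each CoT example as a list of tokens
    example_scores: List[float],      # assumed importance score per example
    token_scores: List[List[float]],  # assumed importance score per token
    shot_budget: int,                 # max examples kept by the coarse stage
    token_budget: int,                # max total tokens kept by the fine stage
) -> List[List[str]]:
    """Toy coarse-to-fine pruning with externally supplied scores."""
    # Coarse stage: keep the highest-scoring examples, in their original order.
    ranked = sorted(
        range(len(examples)), key=lambda i: example_scores[i], reverse=True
    )
    kept = sorted(ranked[:shot_budget])

    # Fine stage: across the kept examples, keep only the highest-scoring
    # tokens so that the total token count fits the budget.
    positions = [
        (token_scores[i][j], i, j)
        for i in kept
        for j in range(len(examples[i]))
    ]
    positions.sort(key=lambda t: t[0], reverse=True)
    survivors = {(i, j) for _, i, j in positions[:token_budget]}

    # Rebuild each kept example from its surviving tokens, preserving order.
    return [
        [tok for j, tok in enumerate(examples[i]) if (i, j) in survivors]
        for i in kept
    ]
```

A greedy top-k selection is used here only because it is easy to follow; a learned policy, as the abstract describes, would replace both scoring inputs and the selection rule.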

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Adam · Byte Pair Encoding · Layer Normalization · Softmax · Residual Connection · Multi-Head Attention