Inference Scaling vs Reasoning: An Empirical Analysis of Compute-Optimal LLM Problem-Solving

Marwan AbdElhameed; Pavly Halim

arXiv:2412.16260·cs.LG·December 24, 2024

TL;DR

This paper empirically analyzes the trade-offs between reasoning accuracy and computational efficiency in large language models, revealing challenges in integrating these objectives and highlighting the need for new architectures.

Contribution

It provides a comprehensive empirical study of combining reasoning and efficiency methods in LLMs, demonstrating the complex interplay and limitations of current approaches.

Findings

01. Quiet-STaR achieves high accuracy but at high computational cost

02. REBASE offers efficiency with baseline accuracy

03. Combining methods degrades performance, indicating fundamental challenges

Abstract

Recent advances in large language models (LLMs) have predominantly focused on maximizing accuracy and reasoning capabilities, often overlooking crucial computational efficiency considerations. While this approach has yielded impressive accuracy improvements, it has led to methods that may be impractical for real-world deployment due to computational overhead and latency constraints. This paper investigates the potential synergy between reasoning enhancement and computational efficiency by analyzing the integration of two contrasting approaches: Quiet-STaR (Self-Taught Reasoner) and REBASE (REward BAlanced SEarch). Through comprehensive empirical analysis using the Mistral-7B model on the GSM8K dataset, we demonstrate that while each method excels in its primary objective, with Quiet-STaR achieving superior accuracy (32.03%) despite high computational cost (554.66s runtime, 12.73T FLOPs), and…
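
The accuracy and cost figures quoted for Quiet-STaR can be folded into simple efficiency ratios. The sketch below is only an illustration: the `RunStats` class and its method names are hypothetical, "12.73T FLOPs" is assumed to mean 12.73 × 10^12 total floating-point operations, and only the Quiet-STaR numbers visible in the abstract above are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    """Hypothetical container for one method's reported evaluation figures."""
    accuracy_pct: float  # accuracy on GSM8K, in percent
    runtime_s: float     # reported runtime, in seconds
    tflops: float        # total compute, assumed to be in units of 1e12 FLOPs

    def accuracy_per_tflop(self) -> float:
        # Accuracy points obtained per 1e12 floating-point operations.
        return self.accuracy_pct / self.tflops

    def tflops_per_second(self) -> float:
        # Average compute rate implied by the reported totals.
        return self.tflops / self.runtime_s


# Figures quoted in the abstract for Quiet-STaR.
quiet_star = RunStats(accuracy_pct=32.03, runtime_s=554.66, tflops=12.73)

print(f"accuracy per TFLOP: {quiet_star.accuracy_per_tflop():.3f}")
print(f"TFLOPs per second:  {quiet_star.tflops_per_second():.5f}")
```

The same ratios could be computed for any other method once its accuracy, runtime, and FLOP totals are known, which would make the accuracy-versus-cost trade-off directly comparable across methods.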

Code & Models

Repositories

marwanwalid2/reasoning-vs-inferncescaling
PyTorch · Official

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques