Fuel Gauge: Estimating Chain-of-Thought Length Ahead of Time in Large Multimodal Models
Yuedong Yang, Xiwen Wei, Mustafa Munir, Radu Marculescu

TL;DR
Fuel Gauge predicts the length of reasoning chains in large multimodal models beforehand, enabling more efficient resource use and improved accuracy by preventing under- and over-thinking.
Contribution
This paper introduces Fuel Gauge, a novel method to estimate Chain-of-Thought length in large multimodal models, improving efficiency and performance.
Findings
Achieves less than half the CoT length prediction error on GPQA-Diamond.
Reduces memory allocation frequency by 13.37x.
Demonstrates effectiveness across text, image-text, and video-text benchmarks.
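To illustrate why an ahead-of-time length estimate can reduce memory allocation frequency, the following minimal sketch compares two hypothetical allocation policies for a growing cache. It is not the paper's allocator: the function names, the block size, and the example lengths are assumptions chosen for illustration, and it does not attempt to reproduce the reported 13.37x figure.

```python
import math


def allocations_incremental(cot_length: int, block_size: int) -> int:
    """Count allocations when the cache grows one fixed-size block at a time.

    Hypothetical baseline policy: a new block is requested whenever the
    current one fills up.
    """
    if cot_length <= 0:
        return 0
    return math.ceil(cot_length / block_size)


def allocations_predictive(cot_length: int, predicted_length: int, block_size: int) -> int:
    """Count allocations when the cache is sized up front from a length estimate.

    Hypothetical predictive policy: one allocation covers the predicted
    length; any shortfall is covered by block-wise growth as in the baseline.
    """
    if cot_length <= 0:
        return 0
    shortfall = max(0, cot_length - predicted_length)
    return 1 + math.ceil(shortfall / block_size)


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not taken from the paper).
    actual, block = 1000, 16
    print(allocations_incremental(actual, block))
    print(allocations_predictive(actual, predicted_length=900, block_size=block))
    print(allocations_predictive(actual, predicted_length=1200, block_size=block))
```

Under these assumptions, an accurate or slightly generous estimate collapses many small allocations into one, while an underestimate still pays only for the shortfall. An overestimate trades fewer allocations for unused reserved memory, which this sketch does not model.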
Abstract
Reasoning Large Multi-modality Models (LMMs) have become the de facto choice for many applications. However, these models rely on a Chain-of-Thought (CoT) process that is lengthy and unpredictable at runtime, often resulting in inefficient use of computational resources (due to memory fragmentation) and sub-optimal accuracy (due to under- and over-thinking). We observe empirically that the CoT process follows a very simple form, whose behavior is independent of the specific generated samples. This suggests that the CoT length can be estimated ahead of time based on a hidden parameter representing the amount of "fuel" available to support the reasoning process. Based on this insight, we propose Fuel Gauge, the first method that extracts this hidden signal and predicts CoT length ahead of time. We demonstrate the utility of Fuel Gauge on two downstream tasks: predictive KV cache…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)
