Expediting and Elevating Large Language Model Reasoning via Hidden Chain-of-Thought Decoding

Tianqiao Liu; Zui Chen; Zitao Liu; Mi Tian; Weiqi Luo

arXiv:2409.08561·cs.CL·September 16, 2024

Open Access

TL;DR

This paper introduces a semantic compression method for chain-of-thought (CoT) reasoning in large language models that cuts decoding time by at least 1.5x while maintaining or improving reasoning accuracy across multiple domains.

Contribution

It proposes a novel semantic alignment approach to compress CoT processes, enabling faster decoding without sacrificing reasoning quality.

Findings

01. Achieves at least 1.5x speedup in decoding time.

02. Maintains or improves task accuracy across domains.

03. Enhances compression quality with contrastive learning.
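
To make the contrastive learning finding concrete, here is a minimal sketch of an InfoNCE-style contrastive loss in plain Python. This page does not state the paper's exact objective, so the choice of InfoNCE, the cosine similarity, the temperature value, and the names `cosine` and `info_nce_loss` are all assumptions for illustration. The idea shown is only the general one: pull a compressed representation toward the embedding of its own full CoT and push it away from embeddings of other CoTs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (illustrative, not the paper's confirmed objective).

    anchor    : compressed representation (hypothetical embedding)
    positive  : embedding of the matching full CoT
    negatives : embeddings of non-matching CoTs
    """
    # Index 0 holds the positive pair; the remaining entries are negatives.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Numerically stable log-sum-exp over all logits.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    # Negative log-probability of selecting the positive pair.
    return log_sum_exp - logits[0]


# Toy vectors, chosen only to show the direction of the effect.
aligned = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned < misaligned)
```

The loss is small when the anchor is closest to its positive and large when a negative is closer, which is the pressure that would encourage a compressed token to stay semantically aligned with its source CoT.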

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in tasks requiring reasoning and multi-step problem-solving through the use of chain-of-thought (CoT) prompting. However, generating the full CoT process results in significantly longer output sequences, leading to increased computational costs and latency during inference. To address this challenge, we propose a novel approach to compress the CoT process through semantic alignment, enabling more efficient decoding while preserving the benefits of CoT reasoning. Our method introduces an auxiliary CoT model that learns to generate and compress the full thought process into a compact special token representation semantically aligned with the original CoT output. This compressed representation is then integrated into the input of the Hidden Chain-of-Thought (HCoT) model. The training process follows a two-stage…
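
The latency argument in the abstract can be illustrated with a back-of-the-envelope estimate. The sketch below assumes that autoregressive decoding time is roughly proportional to the number of generated tokens, and it ignores prompt processing and the cost of the auxiliary CoT model. The function name `estimated_decode_speedup` and the token counts are hypothetical and are not taken from the paper.

```python
def estimated_decode_speedup(cot_tokens, answer_tokens, compressed_tokens):
    """Rough speedup from replacing a full CoT with a few compressed tokens.

    Assumes one decoding step per generated token with equal cost per step,
    which is a simplification rather than a measured result.
    """
    full_length = cot_tokens + answer_tokens
    compressed_length = compressed_tokens + answer_tokens
    if compressed_length <= 0:
        raise ValueError("compressed output must contain at least one token")
    return full_length / compressed_length


# Hypothetical counts: a 120-token CoT, a 40-token answer, 8 compressed tokens.
print(round(estimated_decode_speedup(120, 40, 8), 2))
```

Under this simplification the gain depends on how much of the output the reasoning chain occupies: the longer the CoT relative to the final answer, the larger the potential speedup from compressing it.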

Taxonomy

Topics: Bayesian Modeling and Causal Inference

Methods: Chain-of-thought prompting · Contrastive Learning