TL;DR
SemCoT introduces a novel framework that accelerates Chain-of-Thought reasoning by generating semantically aligned implicit tokens, improving efficiency without sacrificing reasoning accuracy.
Contribution
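SemCoT combines a semantic alignment evaluation with a lightweight implicit reasoning generator. Together these address two challenges of existing implicit CoT methods: implicit reasoning that drifts semantically from the ground-truth reasoning, and the time cost of generating each implicit reasoning token.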
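To make "semantic alignment" concrete, here is a minimal sketch that scores how closely two embedding vectors point in the same direction using cosine similarity. This is an illustrative assumption, not the paper's evaluation procedure: the function names `cosine_similarity` and `alignment_loss` and the toy vectors are hypothetical, and the summary above does not say which similarity measure SemCoT uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def alignment_loss(implicit_vec, reference_vec):
    """Hypothetical loss: 0 when the vectors are aligned, 2 when opposite."""
    return 1.0 - cosine_similarity(implicit_vec, reference_vec)


if __name__ == "__main__":
    # Toy stand-ins for an implicit-reasoning embedding and a
    # ground-truth-reasoning embedding (values are made up).
    implicit = [0.2, 0.7, 0.1]
    reference = [0.25, 0.65, 0.05]
    print(alignment_loss(implicit, reference))
```

A loss of this shape is small when the implicit representation stays close to the reference and grows as the two diverge, which is the property a semantic alignment objective needs.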
Findings
- SemCoT outperforms state-of-the-art methods in both efficiency and effectiveness.
- Preserving semantic alignment improves reasoning accuracy.
- The lightweight generator reduces the time needed to generate reasoning tokens.
Abstract
The verbosity of Chain-of-Thought (CoT) reasoning hinders its mass deployment in efficiency-critical applications. Recently, implicit CoT approaches have emerged, which encode reasoning steps within the LLM's hidden embeddings (termed "implicit reasoning") rather than explicit tokens. This approach accelerates CoT by reducing the reasoning length and bypassing some LLM components. However, existing implicit CoT methods face two significant challenges: (1) they fail to preserve the semantic alignment between the implicit reasoning (when transformed to natural language) and the ground-truth reasoning, resulting in significant CoT performance degradation, and (2) they focus on reducing the length of the implicit reasoning but neglect the considerable time cost for an LLM to generate each individual implicit reasoning token. To tackle these challenges, we propose a novel…
Code & Models
- 🤗 jonathanhe123/SemCoT-Sheared-LLaMA-1.3B-coin_flip · model · 1 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-Sheared-LLaMA-1.3B-commonsense_qa · model · 1 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-Sheared-LLaMA-1.3B-gsm8k · model · 4 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-Sheared-LLaMA-1.3B-multiarith · model · 1 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-Sheared-LLaMA-1.3B-svamp · model · 4 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-mistral-1.1b-coin_flip · model · 1 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-mistral-1.1b-commonsense_qa · model · 3 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-mistral-1.1b-gsm8k · model · 3 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-mistral-1.1b-multiarith · model · 1 dl · ♡ 1
- 🤗 jonathanhe123/SemCoT-mistral-1.1b-svamp · model · 5 dl · ♡ 1
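
The checkpoints listed above follow a regular naming pattern (backbone plus task), so their repository IDs can be built programmatically. The sketch below is a convenience under stated assumptions: the helper names `repo_id`, `all_repo_ids`, and `download_checkpoint` are hypothetical, and `download_checkpoint` only fetches the repository files with `huggingface_hub.snapshot_download`. How the weights should then be loaded is not stated on this page, so check each model card before use.

```python
# Backbones and tasks exactly as they appear in the list above.
BACKBONES = ["Sheared-LLaMA-1.3B", "mistral-1.1b"]
TASKS = ["coin_flip", "commonsense_qa", "gsm8k", "multiarith", "svamp"]


def repo_id(backbone, task):
    """Build a Hugging Face repo ID following the naming pattern in the list."""
    if backbone not in BACKBONES:
        raise ValueError(f"unknown backbone: {backbone!r}")
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return f"jonathanhe123/SemCoT-{backbone}-{task}"


def all_repo_ids():
    """Return the ten repo IDs in the same order as the list."""
    return [repo_id(b, t) for b in BACKBONES for t in TASKS]


def download_checkpoint(backbone, task):
    """Download the repository files and return the local directory path.

    Requires the `huggingface_hub` package and network access; the import is
    local so the helpers above work without it installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id(backbone, task))


if __name__ == "__main__":
    for rid in all_repo_ids():
        print(rid)
```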