L2V-CoT: Cross-Modal Transfer of Chain-of-Thought Reasoning via Latent Intervention
Yuliang Zhan, Xinyu Tang, Han Wan, Jian Li, Ji-Rong Wen, Hao Sun

TL;DR
L2V-CoT is a training-free method that transfers Chain-of-Thought (CoT) reasoning from large language models to vision-language models by intervening on low-frequency latent representations, improving multi-step reasoning in the vision-language models.
Contribution
The paper proposes L2V-CoT, a novel latent intervention technique that transfers CoT reasoning from LLMs to VLMs using frequency domain manipulation, bypassing the need for training or architectural alignment.
Findings
L2V-CoT outperforms training-free baselines on reasoning tasks.
L2V-CoT surpasses some supervised methods.
LLMs and VLMs share similar low-frequency latent representations of CoT reasoning despite architectural differences.
Abstract
Recently, Chain-of-Thought (CoT) reasoning has significantly enhanced the capabilities of large language models (LLMs), but Vision-Language Models (VLMs) still struggle with multi-step reasoning tasks due to limited multimodal reasoning data. To bridge this gap, researchers have explored methods to transfer CoT reasoning from LLMs to VLMs. However, existing approaches either incur high training costs or require architectural alignment. In this paper, we use Linear Artificial Tomography (LAT) to empirically show that LLMs and VLMs share similar low-frequency latent representations of CoT reasoning despite architectural differences. Based on this insight, we propose L2V-CoT, a novel training-free latent intervention approach that transfers CoT reasoning from LLMs to VLMs. L2V-CoT extracts and resamples low-frequency CoT representations from LLMs in the frequency domain, enabling dimension…
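The frequency-domain step described in the abstract (keeping the low-frequency part of a representation and resampling it to a different size) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `lowpass_resample`, the `keep_ratio` parameter, and the choice of a real FFT taken over the hidden dimension are all assumptions made for illustration only.

```python
import numpy as np

def lowpass_resample(vec, target_dim, keep_ratio=0.25):
    """Keep the lowest-frequency components of `vec` and resample it to
    `target_dim` entries via zero-padding/truncation in the Fourier domain.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    vec = np.asarray(vec, dtype=np.float64)
    src_dim = vec.shape[0]

    # Real FFT over the hidden dimension: bin 0 is the mean (DC) component.
    spectrum = np.fft.rfft(vec)

    # Number of low-frequency bins to keep (always at least the DC bin),
    # capped by how many bins a signal of length target_dim can hold.
    n_target_bins = target_dim // 2 + 1
    n_keep = max(1, int(keep_ratio * spectrum.shape[0]))
    n_keep = min(n_keep, n_target_bins)

    # Copy the kept bins into a spectrum sized for the target length;
    # every higher-frequency bin stays zero (the low-pass step).
    new_spectrum = np.zeros(n_target_bins, dtype=np.complex128)
    new_spectrum[:n_keep] = spectrum[:n_keep]

    # irfft divides by target_dim, so rescale to preserve amplitudes.
    return np.fft.irfft(new_spectrum, n=target_dim) * (target_dim / src_dim)

# Usage: resample an 8-dimensional vector to 16 dimensions.
source = np.cos(2 * np.pi * np.arange(8) / 8)
resampled = lowpass_resample(source, target_dim=16, keep_ratio=0.5)
```

In this sketch a single-cycle cosine of length 8 comes out as a single-cycle cosine of length 16, which shows how Fourier-domain resampling can change the vector length while preserving low-frequency structure.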
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Neurobiology of Language and Bilingualism
