On the Impact of Fine-Tuning on Chain-of-Thought Reasoning
Elita Lobo, Chirag Agarwal, Himabindu Lakkaraju

TL;DR
This paper investigates how fine-tuning large language models affects their reasoning abilities, especially Chain-of-Thought reasoning, revealing that fine-tuning can decrease reasoning faithfulness across multiple datasets.
Contribution
It provides the first systematic analysis of fine-tuning's impact on LLM reasoning capabilities and faithfulness of Chain-of-Thought reasoning, addressing a critical gap in understanding.
Findings
Fine-tuning can decrease Chain-of-Thought reasoning faithfulness.
Fine-tuning impacts internal reasoning mechanisms of LLMs.
The study spans four diverse datasets.
Abstract
Large language models have emerged as powerful tools for general intelligence, showcasing advanced natural language processing capabilities that find applications across diverse domains. Despite their impressive performance, recent studies have highlighted the potential for significant enhancements in LLMs' task-specific performance through fine-tuning strategies such as Reinforcement Learning with Human Feedback (RLHF), supervised fine-tuning (SFT), and Quantized Low-Rank Adapters (Q-LoRA). However, previous works have shown that while fine-tuning offers significant performance gains, it also leads to challenges such as catastrophic forgetting and privacy and safety risks. Yet there has been little to no work on understanding the impact of fine-tuning on the reasoning capabilities of LLMs. Our research investigates the effect of fine-tuning on the reasoning…
Taxonomy
Topics: Cognitive Science and Mapping · Advanced Text Analysis Techniques
