Unveiling the Generalization Power of Fine-Tuned Large Language Models
Haoran Yang, Yumeng Zhang, Jiaqi Xu, Hongyuan Lu, Pheng Ann Heng, Wai Lam

TL;DR
This paper investigates how fine-tuning large language models impacts their inherent ability to generalize across different tasks and domains, revealing task-dependent effects and potential improvements through in-context learning.
Contribution
It provides a systematic analysis of the effects of fine-tuning on LLM generalization, highlighting differences between generation and classification tasks and proposing in-context learning as a beneficial strategy.
Findings
Fine-tuning affects generalization differently for generation and classification tasks.
In-context learning during fine-tuning can improve model generalization.
Models fine-tuned on generation tasks show enhanced cross-domain performance.
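The finding about in-context learning during fine-tuning can be made concrete with a small sketch. The summary above does not give the paper's prompt template, so the `Input:`/`Output:` layout, the function name `build_icl_training_example`, and the separator are all hypothetical choices for illustration. The idea shown is only that each fine-tuning example carries a few solved demonstrations ahead of the query, rather than the query alone.

```python
from typing import List, Tuple


def build_icl_training_example(
    demonstrations: List[Tuple[str, str]],
    query_input: str,
    query_target: str,
    separator: str = "\n\n",
) -> Tuple[str, str]:
    """Return (prompt, target) for one fine-tuning example.

    The prompt holds the solved demonstrations followed by the unsolved
    query; the target is the text the model should learn to generate.
    The "Input:/Output:" template is an assumption, not the paper's format.
    """
    # Each demonstration is rendered as a complete input/output pair.
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demonstrations]
    # The query is rendered with its output left open.
    blocks.append(f"Input: {query_input}\nOutput:")
    prompt = separator.join(blocks)
    # A leading space keeps the target separated from the "Output:" label.
    return prompt, " " + query_target


if __name__ == "__main__":
    demos = [("great movie", "positive"), ("dull plot", "negative")]
    prompt, target = build_icl_training_example(demos, "loved it", "positive")
    print(prompt + target)
```

With an empty `demonstrations` list the same function yields a plain fine-tuning example, which makes it easy to compare the two training setups on identical data.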
Abstract
While Large Language Models (LLMs) have demonstrated exceptional multitasking abilities, fine-tuning these models on downstream, domain-specific datasets is often necessary to yield superior performance on test sets compared to their counterparts without fine-tuning. However, the comprehensive effects of fine-tuning on the LLMs' generalization ability are not fully understood. This paper delves into the differences between original, unmodified LLMs and their fine-tuned variants. Our primary investigation centers on whether fine-tuning affects the generalization ability intrinsic to LLMs. To elaborate on this, we conduct extensive experiments across five distinct language tasks on various datasets. Our main findings reveal that models fine-tuned on generation and classification tasks exhibit dissimilar behaviors in generalizing to different domains and tasks. Intriguingly, we observe…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
