Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models

Zheng Zhao; Yftah Ziser; Shay B. Cohen

arXiv:2410.20008 · cs.CL · October 29, 2024

TL;DR

This paper investigates how instruction tuning changes the encoding of task-specific knowledge in large language models. Across more than 60 NLP tasks, it identifies the layers at which representations shift from general to task-oriented.

Contribution

The paper gives a detailed analysis of where and how instruction tuning modifies task-specific information in LLMs, advancing the understanding of multi-task learning mechanisms.

Findings

01. Some tasks are already encoded in the pre-trained model, before any instruction tuning.

02. Instruction tuning enhances task-specific representations.

03. The analysis identifies the layers where models shift from general to task-specific features.

Abstract

Fine-tuning pre-trained large language models (LLMs) on a diverse array of tasks has become a common approach for building models that can solve various natural language processing (NLP) tasks. However, where and to what extent these models retain task-specific knowledge remains largely unexplored. This study investigates the task-specific information encoded in pre-trained LLMs and the effects of instruction tuning on their representations across a diverse set of over 60 NLP tasks. We use a set of matrix analysis tools to examine the differences between the way pre-trained and instruction-tuned LLMs store task-specific information. Our findings reveal that while some tasks are already encoded within the pre-trained LLMs, others greatly benefit from instruction tuning. Additionally, we pinpoint the layers in which the model transitions from high-level general representations to more…
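
The abstract does not say which matrix analysis tools are used, so the following is only an illustrative sketch of one common way to compare layer representations between two models: linear centered kernel alignment (CKA). Choosing CKA is an assumption made here, not a statement of the authors' method, and the function names `linear_cka` and `layerwise_cka` are hypothetical. Each input is assumed to be a matrix of shape (examples, hidden size) holding the activations of one layer on the same set of inputs.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA similarity between two representation matrices.

    x and y have shape (n_examples, dim_x) and (n_examples, dim_y);
    rows must correspond to the same examples in both matrices.
    """
    # Center every feature so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # Squared Frobenius norm of the cross-covariance, normalized by
    # the Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def layerwise_cka(reps_a, reps_b):
    """Score each pair of matching layers from two models.

    reps_a and reps_b are sequences of per-layer activation matrices,
    e.g. from a pre-trained and an instruction-tuned checkpoint
    (hypothetical inputs; extraction of activations is not shown).
    """
    return [linear_cka(a, b) for a, b in zip(reps_a, reps_b)]
```

Under these assumptions, a score near 1 for a layer would suggest that tuning left its representation largely unchanged up to rotation and scaling, while a drop at some depth would point to the kind of layer-specific change the paper studies.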

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Sparse Evolutionary Training