Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks

Anwoy Chatterjee; Eshaan Tanwar; Subhabrata Dutta; Tanmoy Chakraborty

arXiv:2405.10548·cs.CL·June 13, 2024

TL;DR

This paper demonstrates that large language models can use cross-task in-context learning to improve performance on novel tasks, even when the context contains no examples from the target task.

Contribution

The paper introduces a cross-task prompting setup in which labeled examples from predefined tasks are placed in the context so that LLMs can generalize to a novel task, and it reports significant performance improvements.
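As an illustration of the cross-task setup described above, here is a minimal sketch of how such a prompt might be assembled. The `Demo` class, the `build_cross_task_prompt` function, and the "Definition / Input / Output" template are hypothetical names and formatting chosen for this example; the paper's actual prompt template may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Demo:
    """A labeled example from a source (predefined) task. Hypothetical container."""
    definition: str  # natural-language description of the source task
    text: str        # input text of the example
    label: str       # gold label of the example


def build_cross_task_prompt(source_demos: List[Demo],
                            target_definition: str,
                            target_input: str) -> str:
    """Concatenate source-task demonstrations, then the unlabeled target query.

    No target-task example appears in the context; only the target task
    definition and the input to be labeled are appended at the end.
    """
    blocks = [
        f"Definition: {d.definition}\nInput: {d.text}\nOutput: {d.label}"
        for d in source_demos
    ]
    # The final block is left open so the model completes the target label.
    blocks.append(
        f"Definition: {target_definition}\nInput: {target_input}\nOutput:"
    )
    return "\n\n".join(blocks)


demos = [
    Demo("Classify the sentiment of the review as positive or negative.",
         "The plot was dull.", "negative"),
]
prompt = build_cross_task_prompt(
    demos,
    "Decide whether the two sentences are paraphrases (yes or no).",
    "A: He left early. B: He departed ahead of time.",
)
```

The resulting `prompt` string would then be passed to whichever LLM is under evaluation; the model call itself is omitted because the source does not specify it here.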

Findings

01 Cross-task prompting significantly boosts LLM performance, even with no target-task examples in the context.

02 Performance gains are correlated with activation similarities between the source and target tasks.

03 Pseudo-labeling further improves the effectiveness of in-task learning.
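To make the notion of activation similarity between tasks more concrete, here is a rough sketch that compares two tasks by the cosine similarity of their mean activation vectors. This is an assumption for illustration only: the summary does not say which layer the activations come from, how they are pooled, or which similarity measure the authors use, and the function names below are hypothetical.

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a collection of equal-length activation vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def task_similarity(source_acts: Sequence[Sequence[float]],
                    target_acts: Sequence[Sequence[float]]) -> float:
    """Assumed proxy: cosine similarity of mean-pooled activations per task."""
    return cosine_similarity(mean_vector(source_acts), mean_vector(target_acts))


# Toy vectors standing in for model activations (not real data).
same_direction = task_similarity([[1.0, 0.0], [1.0, 0.0]], [[2.0, 0.0]])
orthogonal = task_similarity([[1.0, 0.0]], [[0.0, 3.0]])
```

Under this assumed proxy, a source task whose activations point in a similar direction to the target task's would score close to 1, and one could then check whether such scores track the observed cross-task gains.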

Abstract

Large Language Models (LLMs) have transformed NLP with their remarkable In-context Learning (ICL) capabilities. Automated assistants based on LLMs are gaining popularity; however, adapting them to novel tasks is still challenging. While colossal models excel in zero-shot performance, their computational demands limit widespread use, and smaller language models struggle without context. This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. Drawing inspiration from biological neurons and the mechanistic interpretation of the Transformer architecture, we explore the potential for information sharing across tasks. We design a cross-task prompting setup with three LLMs and show that LLMs achieve significant performance improvements despite no examples from the target task in the context. Cross-task prompting leads to a remarkable…

Code & Models

Repositories

c-anwoy/cross-task-icl
PyTorch · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Data Stream Mining Techniques

Methods: Attention Is All You Need · Dense Connections · Cosine Annealing · Linear Layer · Position-Wise Feed-Forward Layer · Weight Decay · Linear Warmup With Cosine Annealing · Label Smoothing · Residual Connection