Towards Zero-Shot Functional Compositionality of Language Models

Hangyeol Yu; Myeongho Jeong; Jamin Shin; Hyeongdon Moon; Juneyoung Park; Seungtaek Choi

arXiv:2303.03103 · cs.CL · March 7, 2023 · 1 citation

TL;DR

This paper highlights the lack of functional compositionality in current large pre-trained language models and discusses the importance of developing models capable of zero-shot task composition to better emulate human intelligence.

Contribution

The paper identifies the gap in current PLMs regarding functional compositionality and proposes research directions to achieve zero-shot compositional capabilities.

Findings

1. Current PLMs lack functional compositionality.
2. PLMs are far from human-level generalizability in task composition.
3. The paper suggests future research directions for zero-shot compositionality.

Abstract

Large Pre-trained Language Models (PLMs) have become the most desirable starting point in the field of NLP, as they have become remarkably good at solving many individual tasks. Despite such success, in this paper, we argue that current paradigms of working with PLMs are neglecting a critical aspect of modeling human intelligence: functional compositionality. Functional compositionality - the ability to compose learned tasks - has been a long-standing challenge in the field of AI (and many other fields) as it is considered one of the hallmarks of human intelligence. An illustrative example is cross-lingual summarization, where a bilingual person (English-French) could directly summarize an English document into French sentences without having to translate the English document or summary into French explicitly. We discuss why this matter is an important open problem that requires…

Code & Models

Repositories

jshin49/tclm (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Softmax · Dense Connections · Weight Decay · Adam · Dropout