ContraCLM: Contrastive Learning For Causal Language Model

Nihal Jain; Dejiao Zhang; Wasi Uddin Ahmad; Zijian Wang; Feng Nan; Xiaopeng Li; Ming Tan; Ramesh Nallapati; Baishakhi Ray; Parminder Bhatia; Xiaofei Ma; Bing Xiang

arXiv:2210.01185 · cs.CL · May 4, 2023

TL;DR

ContraCLM introduces a contrastive learning framework at both the token and sequence levels to make the representations of causal language models more discriminative, improving performance on semantic textual similarity, code-to-code search, and source code generation.

Contribution

ContraCLM applies contrastive learning at both the token level and the sequence level to causal language models, improving the discrimination of their representations and narrowing the gap with encoder-only models on tasks beyond language generation.

Findings

1. 44% relative improvement on Semantic Textual Similarity tasks
2. 34% relative improvement on Code-to-Code Search tasks
3. 9% relative improvement in execution accuracy for source code generation on the HumanEval benchmark

Abstract

Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both token-level and sequence-level. We assess ContraCLM on a variety of downstream tasks. We show that ContraCLM enhances discrimination of the representations and bridges the gap with the encoder-only models, which makes causal language models better suited for tasks beyond language generation. Specifically, we attain 44% relative improvement on the Semantic Textual Similarity tasks and 34% on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraCLM also boosts the source code generation capability with 9% relative improvement on execution accuracy on the HumanEval benchmark.
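
The abstract does not spell out the loss itself, so the following is only an illustrative sketch of a generic InfoNCE-style contrastive objective with in-batch negatives, not the authors' implementation. The function names `cosine` and `info_nce`, the temperature value, and the toy vectors are all assumptions made for the example. Under that assumption, the same function could be applied either to token representations within one sequence or to pooled sequence representations within a batch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss (hypothetical helper, not from the paper's code).

    positives[i] is the positive example for anchors[i]; every other
    positives[j] acts as an in-batch negative for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy representations (assumed values, for illustration only).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(anchors, [[0.9, 0.1], [0.1, 0.9]])
swapped = info_nce(anchors, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < swapped)  # True: matched pairs give the lower loss
```

Minimizing a loss of this shape pulls each representation toward its positive and pushes it away from the other items in the batch, which is one common way to make representations more discriminative.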

Code & Models

Repositories

amazon-science/contraclm (PyTorch)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Contrastive Learning