ContraCLM: Contrastive Learning For Causal Language Model
Nihal Jain, Dejiao Zhang, Wasi Uddin Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Bing Xiang

TL;DR
ContraCLM introduces a contrastive learning framework at token and sequence levels to improve the discrimination ability of causal language models, enhancing their performance on various downstream tasks.
Contribution
It is the first to apply contrastive learning at both token and sequence levels to causal language models, significantly improving their representation quality and task performance.
Findings
44% relative improvement on Semantic Textual Similarity tasks
34% relative improvement on Code-to-Code Search tasks
9% relative improvement in execution accuracy for source code generation on HumanEval
Abstract
Despite exciting progress in causal language models, the expressiveness of the representations is largely limited due to poor discrimination ability. To remedy this issue, we present ContraCLM, a novel contrastive learning framework at both token-level and sequence-level. We assess ContraCLM on a variety of downstream tasks. We show that ContraCLM enhances discrimination of the representations and bridges the gap with the encoder-only models, which makes causal language models better suited for tasks beyond language generation. Specifically, we attain 44% relative improvement on the Semantic Textual Similarity tasks and 34% on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraCLM also boosts the source code generation capability with 9% relative improvement on execution accuracy on the HumanEval benchmark.
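To make the token-level and sequence-level idea concrete, here is a minimal sketch of an InfoNCE-style contrastive loss in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the loss. The sketch assumes two views of each input (for example, two forward passes with different dropout masks), cosine similarity, mean pooling for sequence representations, and a temperature of 0.05. All function names are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))

def info_nce(views_a, views_b, temperature=0.05):
    """InfoNCE loss: views_a[i] and views_b[i] form a positive pair,
    and every views_b[j] with j != i serves as a negative for views_a[i]."""
    assert len(views_a) == len(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(views_a[i], views_b[j]) / temperature for j in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]  # equals -log softmax(logits)[i]
    return total / n

def mean_pool(token_vectors):
    """Average token vectors into one sequence vector (assumed pooling choice)."""
    dim = len(token_vectors[0])
    return [sum(t[d] for t in token_vectors) / len(token_vectors) for d in range(dim)]

def token_level_loss(seq_a, seq_b, temperature=0.05):
    """Contrast tokens within ONE sequence: token i in view A should match
    token i in view B and differ from the other tokens of that sequence."""
    return info_nce(seq_a, seq_b, temperature)

def sequence_level_loss(batch_a, batch_b, temperature=0.05):
    """Contrast pooled sequence vectors across a batch: sequence i in view A
    should match sequence i in view B and differ from the other sequences."""
    pooled_a = [mean_pool(s) for s in batch_a]
    pooled_b = [mean_pool(s) for s in batch_b]
    return info_nce(pooled_a, pooled_b, temperature)

# Toy data: 2 sequences x 3 tokens x 2 dims. View B is a slightly shifted
# copy of view A, standing in for a second forward pass (an assumption).
batch_a = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
           [[-1.0, 0.5], [0.5, -1.0], [-1.0, -1.0]]]
batch_b = [[[x + 0.01 for x in tok] for tok in seq] for seq in batch_a]

tok_loss = sum(token_level_loss(a, b) for a, b in zip(batch_a, batch_b)) / len(batch_a)
seq_loss = sequence_level_loss(batch_a, batch_b)
print(f"token-level: {tok_loss:.4f}  sequence-level: {seq_loss:.4f}")
```

In a training setup of this kind, the two contrastive terms would typically be added to the standard next-token prediction loss. How the terms are weighted is a design choice that this summary does not specify.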
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Contrastive Learning
