Contrast Is All You Need

Burak Kilic; Floris Bex; Albert Gatt

arXiv:2307.02882 · cs.CL · April 25, 2024

TL;DR

This paper demonstrates that contrastive learning with SetFit improves legal classification accuracy in data-scarce, imbalanced scenarios and enhances the model's focus on legally relevant features compared to vanilla finetuning.

Contribution

It introduces the use of contrastive learning with SetFit for legal classification and shows its advantages over traditional finetuning in low-data settings.

Findings

01 Contrastive learning outperforms vanilla finetuning with fewer samples.

02 SetFit enhances the model's focus on legally informative features.

03 Contrastive approach improves classification confidence and interpretability.
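To make the idea of "features that contribute to a classification decision" concrete, here is a minimal sketch of word-level attribution. It is not LIME itself and not the paper's code: LIME randomly perturbs the input and fits a local linear surrogate, whereas this sketch uses the simpler leave-one-out (occlusion) variant. The classifier `toy_predict_proba`, its keyword weights, and the example sentence are all hypothetical stand-ins.

```python
import math


def toy_predict_proba(text):
    """Hypothetical stand-in classifier: P(obligation) from keyword weights."""
    weights = {"shall": 2.0, "must": 1.5, "may": -2.0}  # assumed, for illustration
    score = sum(weights.get(tok.lower(), 0.0) for tok in text.split())
    return 1.0 / (1.0 + math.exp(-score))


def occlusion_importance(text, predict_proba):
    """Score each token by how much the probability drops when it is removed."""
    tokens = text.split()
    base = predict_proba(text)
    scores = []
    for i, tok in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        # Positive score: the token supports the class; negative: it argues against it.
        scores.append((tok, base - predict_proba(reduced)))
    return scores


importance = occlusion_importance(
    "The controller shall notify the authority", toy_predict_proba
)
for token, score in importance:
    print(f"{token:12s} {score:+.3f}")
```

With a real model, `predict_proba` would wrap the finetuned classifier; tokens with large positive or negative scores are the candidates for "legally informative" features in the sense used above.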

Abstract

In this study, we analyze data-scarce classification scenarios, where available labeled legal data is small and imbalanced, potentially hurting the quality of the results. We focus on two finetuning objectives on a legal provision classification task: SetFit (Sentence Transformer Finetuning), a contrastive learning setup, and a vanilla finetuning setup. Additionally, we compare the features extracted with LIME (Local Interpretable Model-agnostic Explanations) to see which features contributed to the model's classification decisions. The results show that a contrastive setup with SetFit performed better than vanilla finetuning while using a fraction of the training samples. LIME results show that the contrastive learning approach helps boost both positive and negative features that are legally informative and contribute to the classification results. Thus a model…
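One reason a contrastive setup can get by with few labeled samples is that a handful of examples expands into many training pairs. The sketch below illustrates that pair-generation step in plain Python. It is an illustration under stated assumptions, not the authors' code or the SetFit library's implementation: the function name `make_contrastive_pairs`, the parameter `num_pairs_per_class`, and the example provisions and labels are all hypothetical.

```python
import itertools
import random


def make_contrastive_pairs(texts, labels, num_pairs_per_class=4, seed=0):
    """Build (text_a, text_b, target) triples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    pairs = []
    for label, members in by_label.items():
        others = [t for l, ts in by_label.items() if l != label for t in ts]

        # Positive pairs: two different texts with the same label.
        positives = list(itertools.combinations(members, 2))
        rng.shuffle(positives)
        for a, b in positives[:num_pairs_per_class]:
            pairs.append((a, b, 1.0))

        # Negative pairs: one text from this class, one from any other class.
        for _ in range(min(num_pairs_per_class, len(members) * len(others))):
            pairs.append((rng.choice(members), rng.choice(others), 0.0))
    return pairs


# Hypothetical legal provisions with made-up labels.
texts = [
    "The controller shall notify the supervisory authority.",
    "The processor must keep records of processing.",
    "The employer shall provide written notice.",
    "The data subject may lodge a complaint.",
    "Member States may adopt further rules.",
    "The tenant may terminate the agreement.",
]
labels = ["obligation"] * 3 + ["permission"] * 3

pairs = make_contrastive_pairs(texts, labels)
print(len(texts), "labeled examples ->", len(pairs), "training pairs")
```

In a SetFit-style pipeline, pairs like these would be used to finetune a sentence encoder so that same-label texts end up close together in embedding space, after which a lightweight classification head is trained on the resulting embeddings.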

Taxonomy

Topics: Artificial Intelligence in Law · Imbalanced Data Classification Techniques

Methods: Attention Is All You Need · Layer Normalization · Absolute Position Encodings · Label Smoothing · Byte Pair Encoding · Linear Layer · Adam · Multi-Head Attention · Position-Wise Feed-Forward Layer · Residual Connection