Contrast Is All You Need
Burak Kilic, Floris Bex, Albert Gatt

TL;DR
This paper demonstrates that contrastive learning with SetFit improves legal classification accuracy in data-scarce, imbalanced scenarios and enhances the model's focus on legally relevant features compared to vanilla finetuning.
Contribution
It introduces the use of contrastive learning with SetFit for legal classification and shows its advantages over traditional finetuning in low-data settings.
Findings
Contrastive learning with SetFit outperforms vanilla finetuning while using a fraction of the training samples.
SetFit enhances the model's focus on legally informative features.
The contrastive approach improves classification confidence and interpretability.
Abstract
In this study, we analyze data-scarce classification scenarios, where available labeled legal data is small and imbalanced, potentially hurting the quality of the results. We focus on two finetuning objectives on a legal provision classification task: SetFit (Sentence Transformer Finetuning), a contrastive learning setup, and a vanilla finetuning setup. Additionally, we compare the features extracted with LIME (Local Interpretable Model-agnostic Explanations) to see which particular features contributed to the model's classification decisions. The results show that a contrastive setup with SetFit performed better than vanilla finetuning while using a fraction of the training samples. LIME results show that the contrastive learning approach helps boost both positive and negative features which are legally informative and contribute to the classification results. Thus a model…
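As a rough illustration of why a contrastive setup can get by with few labeled samples, the sketch below builds sentence pairs from a tiny labeled set in the general style SetFit uses: two sentences with the same label form a positive pair, two with different labels form a negative pair. The function name `make_contrastive_pairs`, the example sentences, and the label names are all hypothetical and are not taken from the paper or its dataset. The actual SetFit library samples pairs rather than enumerating all of them, so this is a simplified sketch, not the authors' pipeline.

```python
import itertools


def make_contrastive_pairs(texts, labels):
    """Enumerate all sentence pairs and mark each one as positive or negative.

    Returns a list of (text_a, text_b, target) tuples, where target is 1.0
    when both sentences share a label and 0.0 otherwise. Illustrative only:
    real SetFit training samples pairs instead of enumerating every one.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")

    pairs = []
    examples = list(zip(texts, labels))
    for (text_a, label_a), (text_b, label_b) in itertools.combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical, imbalanced toy data (three "obligation" sentences, one "permission").
toy_texts = [
    "The tenant shall pay rent on the first day of each month.",
    "The supplier must deliver the goods within thirty days.",
    "The employer is required to provide safety equipment.",
    "The licensee may sublicense the software to affiliates.",
]
toy_labels = ["obligation", "obligation", "obligation", "permission"]

toy_pairs = make_contrastive_pairs(toy_texts, toy_labels)
num_positive = sum(1 for _, _, target in toy_pairs if target == 1.0)
num_negative = len(toy_pairs) - num_positive
print(f"{len(toy_texts)} sentences -> {len(toy_pairs)} pairs "
      f"({num_positive} positive, {num_negative} negative)")
```

Four labeled sentences yield six training pairs here, and the single minority-class sentence appears in three of them. Because the number of possible pairs grows quadratically with the number of sentences, pair-based training gives the encoder more signal per labeled example than treating each sentence as one isolated training instance.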
Taxonomy
Topics: Artificial Intelligence in Law · Imbalanced Data Classification Techniques
Methods: Attention Is All You Need · Layer Normalization · Absolute Position Encodings · Label Smoothing · Byte Pair Encoding · Linear Layer · Adam · Multi-Head Attention · Position-Wise Feed-Forward Layer · Residual Connection
