Evaluation of Few-Shot Learning for Classification Tasks in the Polish Language

Tsimur Hadeliya; Dariusz Kajtoch

arXiv:2404.17832·cs.CL·April 30, 2024

TL;DR

This paper introduces a few-shot benchmark and prompt templates for Polish classification tasks and uses them to compare few-shot learning methods. In-context learning with commercial models such as GPT-3.5 and GPT-4 performs best, but still lags behind full fine-tuning.

Contribution

Introduces a Polish few-shot benchmark with seven classification tasks, compares fine-tuning, linear probing, SetFit, and in-context learning on it, and reports both the effect of pre-training on Polish and the remaining gap to full fine-tuning.

Findings

01. ICL with GPT-3.5 and GPT-4 outperforms other methods.

02. SetFit and linear probing are competitive alternatives.

03. Pre-training on Polish improves model performance.

Abstract

We introduce a few-shot benchmark consisting of 7 different classification tasks native to the Polish language. We conducted an empirical comparison with 0 and 16 shots between fine-tuning, linear probing, SetFit, and in-context learning (ICL) using various pre-trained commercial and open-source models. Our findings reveal that ICL achieves the best performance, with commercial models like GPT-3.5 and GPT-4 leading. However, there remains a significant gap of 14 percentage points between our best few-shot learning score and the performance of HerBERT-large fine-tuned on the entire training dataset. Among the techniques, SetFit emerges as the second-best approach, closely followed by linear probing. We observed the worst and most unstable performance with non-linear head fine-tuning. Results for ICL indicate that continual pre-training of models like Mistral-7b or…
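To make the 0-shot and 16-shot ICL settings mentioned in the abstract concrete, here is a minimal Python sketch of how labelled demonstrations might be sampled and placed in a prompt. Everything in it is an assumption for illustration: the function names `sample_k_shots` and `build_prompt`, the "Text:/Label:" template, the instruction string, the toy sentences, and the choice to sample k examples per class (the abstract does not say whether 16 shots is per class or in total). It is not the benchmark's actual templates or sampling code.

```python
import random
from collections import defaultdict


def sample_k_shots(examples, k, seed=0):
    """Pick up to k labelled examples per class, reproducibly.

    `examples` is a list of (text, label) pairs. Sampling per class is an
    assumption made for this sketch, not a detail taken from the paper.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append(text)

    shots = []
    for label in sorted(by_label):
        texts = by_label[label]
        chosen = rng.sample(texts, min(k, len(texts)))
        shots.extend((text, label) for text in chosen)

    # Shuffle so demonstrations are not grouped by class in the prompt.
    rng.shuffle(shots)
    return shots


def build_prompt(shots, query, instruction="Classify the sentiment of the text."):
    """Format demonstrations and the query with a hypothetical template.

    With an empty `shots` list this yields a zero-shot prompt.
    """
    lines = [instruction, ""]
    for text, label in shots:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy data invented for this example only.
    toy_train = [
        ("Świetny film, polecam.", "positive"),
        ("Bardzo dobra obsługa.", "positive"),
        ("Nie polecam tego produktu.", "negative"),
        ("Strata czasu.", "negative"),
    ]
    demos = sample_k_shots(toy_train, k=1, seed=42)
    print(build_prompt(demos, "Jedzenie było zimne."))
```

Calling `build_prompt([], query)` gives the zero-shot variant, while passing `k=16` to `sample_k_shots` would give a 16-shot variant under the per-class assumption stated above.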

Taxonomy

Topics: Language and Culture · Interpreting and Communication in Healthcare

Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Linear Layer · Label Smoothing · Adam · Layer Normalization · Attention Dropout