Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian

Serena Auriemma; Martina Miliani; Mauro Madeddu; Alessandro Bondielli; Lucia Passaro; Alessandro Lenci

arXiv:2407.20654·cs.CL·July 31, 2024

TL;DR

This study investigates the use of smaller, domain-specific encoder models with prompting techniques to improve zero-shot classification in Italian legal and bureaucratic language, highlighting their advantages in low-resource scenarios.

Contribution

The study demonstrates that domain-specific encoder models, combined with prompting and calibration, outperform general-purpose models on specialized Italian tasks, especially in zero-shot settings.

Findings

01. Pre-trained domain-specific models show better adaptability for Italian legal tasks.

02. Calibration techniques significantly improve model performance.

03. Domain models are advantageous when in-domain resources are limited.
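
The role of calibration in prompt-based zero-shot classification can be illustrated with a minimal sketch. It assumes one common scheme, dividing each label's probability by the probability the model assigns to that label on a content-free input, which may differ from the calibration method used in the paper. The label words and logit values below are hypothetical, chosen only to show how calibration can change a prediction.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def calibrate(label_probs, prior_probs):
    """Divide each label probability by its content-free prior, then renormalize."""
    ratios = [p / q for p, q in zip(label_probs, prior_probs)]
    total = sum(ratios)
    return [r / total for r in ratios]

def classify(mask_logits, prior_logits, labels):
    """Return the label with the highest calibrated score, plus all scores."""
    probs = softmax(mask_logits)
    priors = softmax(prior_logits)
    calibrated = calibrate(probs, priors)
    best = max(range(len(labels)), key=lambda i: calibrated[i])
    return labels[best], calibrated

# Hypothetical verbalizer words and logits at the masked position (assumptions).
labels = ["legge", "decreto", "circolare"]
mask_logits = [2.0, 1.8, 0.5]   # scores for a real document inserted in the prompt
prior_logits = [2.0, 0.5, 0.5]  # scores for the same prompt with a content-free input

prediction, scores = classify(mask_logits, prior_logits, labels)
print(prediction, [round(s, 3) for s in scores])
```

In this toy case the uncalibrated scores favor the first label, which the model also prefers when given no content at all; after dividing out that prior, the second label wins.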

Abstract

Addressing the challenge of limited annotated data in specialized fields and low-resource languages is crucial for the effective use of Language Models (LMs). While most Large Language Models (LLMs) are trained on general-purpose English corpora, there is a notable gap in models specifically tailored for Italian, particularly for technical and bureaucratic jargon. This paper explores the feasibility of employing smaller, domain-specific encoder LMs alongside prompting techniques to enhance performance in these specialized contexts. Our study concentrates on the Italian bureaucratic and legal language, experimenting with both general-purpose and further pre-trained encoder-only models. We evaluated the models on downstream tasks such as document classification and entity typing and conducted intrinsic evaluations using Pseudo-Log-Likelihood. The results indicate that while further…
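
The Pseudo-Log-Likelihood mentioned in the abstract is commonly computed by masking each token in turn and summing the log-probability the model assigns to the original token. The sketch below shows that loop under stated assumptions: `masked_prob` is a hypothetical callback standing in for a masked language model's forward pass, and `toy_masked_prob` is a made-up lookup table used only so the example runs without a model. It is not the paper's evaluation code.

```python
import math
from typing import Callable, Sequence

def pseudo_log_likelihood(
    tokens: Sequence[str],
    masked_prob: Callable[[Sequence[str], int], float],
) -> float:
    """Sum of log P(token_i | all other tokens), masking one position at a time."""
    total = 0.0
    for i in range(len(tokens)):
        # With a real model, position i would be replaced by the mask token
        # and the probability of the original token read from the output.
        total += math.log(masked_prob(tokens, i))
    return total

def toy_masked_prob(tokens: Sequence[str], i: int) -> float:
    """Hypothetical stand-in for a model: a fixed table that ignores context."""
    table = {"il": 0.5, "decreto": 0.25, "entra": 0.125}
    return table.get(tokens[i], 0.01)

sentence = ["il", "decreto", "entra"]
pll = pseudo_log_likelihood(sentence, toy_masked_prob)
# Pseudo-perplexity: exponentiated negative PLL per token.
pseudo_ppl = math.exp(-pll / len(sentence))
print(round(pll, 4), round(pseudo_ppl, 4))
```

A higher (less negative) score means the model finds the sentence more predictable, which is why the measure is used to compare how well different encoders fit in-domain text.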

Taxonomy

Topics: Speech Recognition and Synthesis