Zero and Few-shot Learning for Author Profiling

Mara Chinea-Rios; Thomas Müller; Gretel Liz De la Peña Sarracén; Francisco Rangel; Marc Franco-Salvador

arXiv:2204.10543·cs.CL·May 18, 2022·1 cites

Open Access

TL;DR

This paper investigates zero- and few-shot learning methods for author profiling. It shows that entailment-based models outperform supervised text classifiers based on roberta-XLM and reach 80% of the accuracy of previous approaches with less than 50% of the training data on average, on Spanish and English tasks.

Contribution

It explores entailment-based zero- and few-shot models for low-resource author profiling and evaluates them on Spanish and English tasks, varying both the entailment hypothesis and the size of the few-shot training sample.

Findings

01

Entailment-based models outperform supervised text classifiers based on roberta-XLM.

02

Entailment-based models reach 80% of the accuracy of previous approaches using less than 50% of the training data on average.

03

Effective in both Spanish and English author profiling tasks.

Abstract

Author profiling classifies author characteristics by analyzing how language is shared among people. In this work, we study that task from a low-resource viewpoint: using little or no training data. We explore different zero- and few-shot models based on entailment and evaluate our systems on several profiling tasks in Spanish and English. In addition, we study the effect of both the entailment hypothesis and the size of the few-shot training sample. We find that entailment-based models outperform supervised text classifiers based on roberta-XLM and that we can reach 80% of the accuracy of previous approaches using less than 50% of the training data on average.
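To make the entailment framing concrete, the following is a minimal sketch of zero-shot classification by entailment, not the authors' implementation. Each candidate label is turned into a natural-language hypothesis, an entailment scorer rates how well the text (the premise) supports each hypothesis, and the best-scoring label wins. The template wording, the labels, and the function names are all assumptions made for illustration, and `toy_entail_score` is a hypothetical word-overlap stand-in for a real entailment model.

```python
from typing import Callable, Dict, List

# Hypothetical hypothesis template (an assumption, not taken from the paper).
# The abstract notes that the choice of hypothesis affects results.
TEMPLATE = "The author of this text is {}."


def build_hypotheses(labels: List[str], template: str = TEMPLATE) -> Dict[str, str]:
    """Map each candidate label to a natural-language hypothesis."""
    return {label: template.format(label) for label in labels}


def zero_shot_classify(
    text: str,
    labels: List[str],
    entail_score: Callable[[str, str], float],
    template: str = TEMPLATE,
) -> str:
    """Return the label whose hypothesis is most strongly entailed by the text."""
    hypotheses = build_hypotheses(labels, template)
    scores = {label: entail_score(text, hyp) for label, hyp in hypotheses.items()}
    return max(scores, key=scores.get)


def toy_entail_score(premise: str, hypothesis: str) -> float:
    """Stand-in scorer: fraction of hypothesis words that appear in the premise.

    This is NOT a real entailment model; in practice a trained NLI model
    would supply the score for each (premise, hypothesis) pair.
    """
    premise_words = set(premise.lower().split())
    hyp_words = [w.strip(".").lower() for w in hypothesis.split()]
    return sum(w in premise_words for w in hyp_words) / len(hyp_words)


if __name__ == "__main__":
    # Illustrative labels only; they are not taken from the source.
    text = "I am a bot posting automated updates"
    print(zero_shot_classify(text, ["bot", "human"], toy_entail_score))
```

Because the scorer is passed in as a function, the same loop would cover the few-shot setting by swapping in a scorer fine-tuned on a small labeled sample, and changing `TEMPLATE` is one way to probe the effect of the hypothesis wording.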

Taxonomy

Topics: Authorship Attribution and Profiling · Spam and Phishing Detection · Hate Speech and Cyberbullying Detection