The Parrot Dilemma: Human-Labeled vs. LLM-augmented Data in Classification Tasks

Anders Giovanni Møller; Jacob Aarup Dalsgaard; Arianna Pera; Luca Maria Aiello

arXiv:2304.13861 · cs.CL · February 6, 2024 · 26 cites

TL;DR

This study compares human-labeled data with synthetic data generated by GPT-4 and Llama-2 across ten Computational Social Science (CSS) classification tasks. Human-labeled data generally outperforms synthetic data, synthetic augmentation helps with rare classes, and LLMs perform well in zero-shot settings but lag behind trained classifiers.

Contribution

The paper provides guidelines for data annotation in CSS, evaluating how effective synthetic data is as training material and comparing LLM-based zero-shot classification with traditional trained classifiers.

Findings

01. Human-labeled data outperforms synthetic data in most cases.

02. Synthetic augmentation improves rare class performance.

03. LLMs perform well in zero-shot classification but lag behind trained classifiers.
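
The second finding, that synthetic augmentation helps mainly with rare classes, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `augment_rare_classes`, the `min_count` threshold, and the toy examples are all hypothetical, chosen only to show one simple way to top up under-represented classes with synthetic examples while leaving well-represented classes untouched.

```python
from collections import Counter


def augment_rare_classes(human, synthetic, min_count):
    """Top up under-represented classes with synthetic examples.

    human, synthetic: lists of (text, label) pairs.
    min_count: hypothetical per-class target; classes already at or above
    it receive no synthetic examples.
    """
    counts = Counter(label for _, label in human)
    augmented = list(human)
    for text, label in synthetic:
        # Only add a synthetic example while its class is still below target.
        if counts[label] < min_count:
            augmented.append((text, label))
            counts[label] += 1
    return augmented


# Toy data (invented for illustration): "negative" is the rare class.
human = [
    ("great talk", "positive"),
    ("loved it", "positive"),
    ("nice work", "positive"),
    ("this is awful", "negative"),
]
synthetic = [
    ("what a waste of time", "negative"),
    ("really enjoyed this", "positive"),
    ("terrible experience", "negative"),
    ("never again", "negative"),
]

augmented = augment_rare_classes(human, synthetic, min_count=3)
label_counts = Counter(label for _, label in augmented)
print(label_counts)
```

In this sketch the human-labeled examples are always kept, and synthetic examples are added only for classes that fall short of the target, which mirrors the idea that augmentation is most useful where human labels are scarce.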

Abstract

In the realm of Computational Social Science (CSS), practitioners often navigate complex, low-resource domains and face the costly and time-intensive challenges of acquiring and annotating data. We aim to establish a set of guidelines to address such challenges, comparing the use of human-labeled data with synthetically generated data from GPT-4 and Llama-2 in ten distinct CSS classification tasks of varying complexity. Additionally, we examine the impact of training data sizes on performance. Our findings reveal that models trained on human-labeled data consistently exhibit superior or comparable performance compared to their synthetically augmented counterparts. Nevertheless, synthetic augmentation proves beneficial, particularly in improving performance on rare classes within multi-class tasks. Furthermore, we leverage GPT-4 and Llama-2 for zero-shot classification and find that,…

Taxonomy

Topics: Machine Learning and Data Classification

Methods: Attention Is All You Need · Test · Linear Layer · Adam · Layer Normalization · Dense Connections · Label Smoothing · Dropout · Absolute Position Encodings · Multi-Head Attention