Evaluating LLM Prompts for Data Augmentation in Multi-label Classification of Ecological Texts

Anna Glazkova; Olga Zakharova

arXiv:2411.14896·cs.CL·March 4, 2025

Open Access

TL;DR

This paper evaluates prompt-based data augmentation with large language models for multi-label classification of ecological texts, specifically detecting mentions of green practices in Russian social media, and reports that augmentation improves classification performance.

Contribution

It compares several prompt-based data augmentation strategies for ecological text classification (rewriting existing data, generating new data, and combining both) against models trained without augmented data.

Findings

1. All augmentation strategies improved classification accuracy.

2. Paraphrasing prompts yielded the best results.

3. Augmentation outperformed baseline models without data enhancement.

Abstract

Large language models (LLMs) play a crucial role in natural language processing (NLP) tasks, improving the understanding, generation, and manipulation of human language across domains such as translating, summarizing, and classifying text. Previous studies have demonstrated that instruction-based LLMs can be effectively utilized for data augmentation to generate diverse and realistic text samples. This study applied prompt-based data augmentation to detect mentions of green practices in Russian social media. Detecting green practices in social media aids in understanding their prevalence and helps formulate recommendations for scaling eco-friendly actions to mitigate environmental issues. We evaluated several prompts for augmenting texts in a multi-label classification task, either by rewriting existing datasets using LLMs, generating new data, or combining both approaches. Our results…
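The three augmentation modes named in the abstract (rewriting existing texts, generating new ones, or combining both) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the prompt wording, the function names (`paraphrase_prompt`, `generation_prompt`, `augment`), the label names, and the `echo_llm` stand-in for a real model call are hypothetical and are not the authors' prompts or code.

```python
from typing import Callable, List, Tuple

# One multi-label sample: (text, list of category labels).
Sample = Tuple[str, List[str]]


def paraphrase_prompt(text: str, labels: List[str]) -> str:
    """Hypothetical rewriting prompt that names the sample's categories."""
    return (
        "Rewrite the following social media post in different words. "
        f"It mentions these green practices: {', '.join(labels)}. "
        "Keep every practice in the rewritten post.\n\n"
        f"Post: {text}"
    )


def generation_prompt(labels: List[str]) -> str:
    """Hypothetical prompt that asks for a brand-new post with given labels."""
    return (
        "Write a short social media post that mentions these green "
        f"practices: {', '.join(labels)}."
    )


def augment(
    dataset: List[Sample],
    llm: Callable[[str], str],
    strategy: str = "rewrite",
) -> List[Sample]:
    """Return the original samples plus synthetic ones.

    strategy: "rewrite" (paraphrase each text), "generate" (new text per
    label set), or "both". Synthetic samples inherit the source labels.
    """
    if strategy not in ("rewrite", "generate", "both"):
        raise ValueError(f"unknown strategy: {strategy}")
    augmented = list(dataset)  # keep the original data untouched
    for text, labels in dataset:
        if strategy in ("rewrite", "both"):
            augmented.append((llm(paraphrase_prompt(text, labels)), list(labels)))
        if strategy in ("generate", "both"):
            augmented.append((llm(generation_prompt(labels)), list(labels)))
    return augmented


def echo_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns a tagged prompt prefix."""
    return f"[synthetic] {prompt[:40]}"


if __name__ == "__main__":
    data = [
        ("We sorted plastic and glass today", ["waste sorting"]),
        ("Swapped books at the fair", ["exchanging", "sharing"]),
    ]
    for text, labels in augment(data, echo_llm, strategy="both"):
        print(labels, "|", text)
```

In practice `echo_llm` would be replaced by a call to an instruction-tuned model, and the augmented list would be used only for training, with the evaluation split left unaugmented.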

Taxonomy

Topics: Advanced Text Analysis Techniques · Text and Document Classification Technologies