Ensembling Finetuned Language Models for Text Classification

Sebastian Pineda Arango; Maciej Janowski; Lennart Purucker; Arber Zela; Frank Hutter; Josif Grabocka

arXiv:2410.19889·cs.CL·October 29, 2024

TL;DR

This paper studies how ensembling multiple finetuned language models can improve text classification performance. It contributes a metadataset of model predictions and an analysis of ensembling strategies, with the aim of encouraging broader adoption of ensembles.

Contribution

The paper introduces a metadataset of predictions from five large finetuned models on six datasets and evaluates several ensembling strategies built from those predictions.

Findings

01 Ensembling improves classification accuracy across datasets.

02 Different ensembling strategies yield varying performance gains.

03 Ensembling can provide more reliable uncertainty estimates.

Abstract

Finetuning is a widespread practice across different communities for adapting pretrained models to particular tasks. Text classification is one such task, and many pretrained models are available for it. Ensembles of neural networks, meanwhile, are typically used to boost performance and provide reliable uncertainty estimates. However, ensembling pretrained models for text classification is not well studied. In this paper, we present a metadataset with predictions from five large finetuned models on six datasets, and report results of different ensembling strategies built from these predictions. Our results shed light on how ensembling can improve the performance of finetuned text classifiers and incentivize future adoption of ensembles in such tasks.
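To make the idea of ensembling from stored predictions concrete, the sketch below combines per-model class probabilities in two common ways: averaging the probabilities (soft voting) and taking a majority vote over predicted labels. It is a minimal illustration only. The array layout, the function names, and the toy numbers are assumptions made for this example; they are not taken from the paper's metadataset, and the strategies shown are not necessarily the ones the paper evaluates.

```python
import numpy as np


def average_ensemble(probs: np.ndarray) -> np.ndarray:
    """Soft voting: average class probabilities over models.

    probs has shape (n_models, n_samples, n_classes).
    Returns an array of shape (n_samples, n_classes).
    """
    return probs.mean(axis=0)


def majority_vote(probs: np.ndarray) -> np.ndarray:
    """Hard voting: each model votes for its argmax class.

    Returns predicted labels of shape (n_samples,).
    Ties are broken in favour of the lowest class index.
    """
    n_classes = probs.shape[2]
    votes = probs.argmax(axis=2)            # (n_models, n_samples)
    one_hot = np.eye(n_classes)[votes]      # (n_models, n_samples, n_classes)
    return one_hot.sum(axis=0).argmax(axis=1)


def accuracy(pred_labels: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return float((pred_labels == labels).mean())


# Hypothetical predictions: 3 models, 3 samples, 2 classes (toy values).
probs = np.array([
    [[0.90, 0.10], [0.40, 0.60], [0.20, 0.80]],  # model A
    [[0.60, 0.40], [0.70, 0.30], [0.55, 0.45]],  # model B
    [[0.30, 0.70], [0.60, 0.40], [0.10, 0.90]],  # model C
])
labels = np.array([0, 0, 1])

# Accuracy of each model on its own.
single_acc = [accuracy(p.argmax(axis=1), labels) for p in probs]

# Accuracy of the two ensembles.
avg_probs = average_ensemble(probs)
avg_acc = accuracy(avg_probs.argmax(axis=1), labels)
vote_acc = accuracy(majority_vote(probs), labels)

print("single-model accuracy:", single_acc)
print("averaging accuracy:   ", avg_acc)
print("majority-vote accuracy:", vote_acc)
```

In this toy case each model misclassifies a different sample, so both ensembles classify all three correctly. Because the ensembles operate only on stored predictions, no model needs to be rerun to compare strategies.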

Code & Models

Repositories

sebastianpinedaar/finetuning_text_classifiers (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Text and Document Classification Technologies · Topic Modeling