Cluster & Tune: Boost Cold Start Performance in Text Classification

Eyal Shnarch; Ariel Gera; Alon Halfon; Lena Dankin; Leshem Choshen; Ranit Aharonov; Noam Slonim

arXiv:2203.10581 · cs.CL · March 22, 2022 · 1 citation

TL;DR

This paper proposes an intermediate clustering step to improve the performance of pre-trained models like BERT in text classification tasks with limited labeled data, significantly enhancing results in cold start scenarios.

Contribution

Introducing an unsupervised clustering-based intermediate task to boost fine-tuning performance in low-data text classification settings.

Findings

01

Training the pre-trained model to predict cluster labels before fine-tuning improves classification performance in cold start scenarios.

02

Performance gains are most notable in topical classification tasks.

03

The method is effective when only a couple of dozen to a few hundred labeled examples are available for fine-tuning.

Abstract

In real-world scenarios, a text classification task often begins with a cold start, when labeled data is scarce. In such cases, the common practice of fine-tuning pre-trained models, such as BERT, for a target classification task, is prone to produce poor performance. We suggest a method to boost the performance of such models by adding an intermediate unsupervised classification task, between the pre-training and fine-tuning phases. As such an intermediate task, we perform clustering and train the pre-trained model on predicting the cluster labels. We test this hypothesis on various data sets, and show that this additional classification phase can significantly improve performance, mainly for topical classification tasks, when the number of labeled instances available for fine-tuning is only a couple of dozen to a few hundred.
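The first half of the procedure described in the abstract, turning unlabeled texts into a pseudo-labeled classification task, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm or the text representation, so plain k-means over bag-of-words vectors is used here as a stand-in, and the names `bow_vectors`, `kmeans_labels`, and `intermediate_task` are hypothetical.

```python
import math
import random
from collections import Counter


def bow_vectors(texts):
    """Map each text to an L2-normalized bag-of-words vector over a shared vocabulary."""
    tokenized = [text.lower().split() for text in texts]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = float(count)
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(vectors, k, iterations=20, seed=0):
    """Plain k-means (an assumed stand-in); returns one cluster id in range(k) per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each non-empty centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy unlabeled corpus (made up for illustration).
unlabeled_texts = [
    "the team won the match",
    "the team won the match",
    "the striker scored a late goal",
    "the bank raised interest rates",
    "markets fell as interest rates rose",
    "the coach praised the team",
]

pseudo_labels = kmeans_labels(bow_vectors(unlabeled_texts), k=2)

# (text, cluster id) pairs form the intermediate classification task.
intermediate_task = list(zip(unlabeled_texts, pseudo_labels))
```

In the setting the abstract describes, pairs like those in `intermediate_task` would be used to train the pre-trained model as an ordinary classifier, and the resulting weights would then be fine-tuned on the small labeled set for the target task. The cluster ids carry no meaning of their own; they only give the model a classification signal derived from the unlabeled data.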

Code & Models

Repositories

ibm/intermediate-training-using-clustering · tf · Official

Taxonomy

Topics: Topic Modeling · Text and Document Classification Technologies · Machine Learning and Data Classification

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Attention Dropout · Residual Connection · Linear Warmup With Linear Decay · Dense Connections · Weight Decay · WordPiece