Imbalanced Multi-label Classification for Business-related Text with Moderately Large Label Spaces
Muhammad Arslan, Christophe Cruz

TL;DR
This paper compares four multi-label text classification methods on an imbalanced business dataset and finds that fine-tuned BERT outperforms the three traditional methods by a significant margin on accuracy, F1 score, precision, and recall.
Contribution
It demonstrates the superior performance of fine-tuned BERT over traditional multi-label classification methods on an imbalanced business text dataset.
Findings
Fine-tuned BERT achieves the highest accuracy and F1 scores.
Binary Relevance also performs well, though not as well as fine-tuned BERT.
Classifier Chains and Label Powerset underperform on this dataset.
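The findings above are stated in terms of accuracy and F1. As a rough illustration of how such scores are commonly computed for multi-label predictions, the sketch below implements exact-match (subset) accuracy and micro-averaged precision, recall, and F1 in plain Python. The summary does not say which accuracy definition or averaging scheme the paper uses, so both choices are assumptions, and the label names and toy predictions are hypothetical.

```python
from typing import List, Set, Tuple


def subset_accuracy(y_true: List[Set[str]], y_pred: List[Set[str]]) -> float:
    """Fraction of documents whose predicted label set matches the true set exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def micro_prf(y_true: List[Set[str]], y_pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1: counts are pooled over all labels."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # labels predicted and correct
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # labels predicted but wrong
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # true labels that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: three documents with made-up business labels.
y_true = [{"finance", "merger"}, {"hiring"}, {"finance"}]
y_pred = [{"finance"}, {"hiring"}, {"finance", "hiring"}]

print("subset accuracy:", subset_accuracy(y_true, y_pred))
print("micro P/R/F1:", micro_prf(y_true, y_pred))
```

Subset accuracy is strict: a document counts as correct only if every label matches. On imbalanced data, micro-averaging is dominated by frequent labels, so macro-averaged scores are often reported alongside it.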
Abstract
In this study, we compared the performance of four methods for multi-label text classification on a specific imbalanced business dataset. The four methods we evaluated were fine-tuned BERT, Binary Relevance, Classifier Chains, and Label Powerset. The results show that fine-tuned BERT outperforms the other three methods by a significant margin, achieving high values of accuracy, F1 score, precision, and recall. Binary Relevance also performs well on this dataset, while Classifier Chains and Label Powerset demonstrate relatively poor performance. These findings highlight the effectiveness of fine-tuned BERT for multi-label text classification tasks and suggest that it may be a useful tool for businesses seeking to analyze complex and multifaceted texts.
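Binary Relevance, Classifier Chains, and Label Powerset are all problem-transformation methods: each one reshapes the multi-label targets so that ordinary single-label classifiers can be trained on them. The sketch below shows only that target reshaping, in plain Python, and is not the authors' implementation. The label space, feature values, and function names are hypothetical, chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

LABELS = ["finance", "hiring", "merger"]  # hypothetical label space


def binary_relevance_targets(y: List[Set[str]], labels: List[str]) -> Dict[str, List[int]]:
    """Binary Relevance: one independent 0/1 target vector per label."""
    return {lab: [int(lab in s) for s in y] for lab in labels}


def label_powerset_targets(y: List[Set[str]]) -> Tuple[List[int], Dict[FrozenSet[str], int]]:
    """Label Powerset: each distinct label combination becomes one class id."""
    class_ids: Dict[FrozenSet[str], int] = {}
    targets: List[int] = []
    for s in y:
        key = frozenset(s)
        if key not in class_ids:
            class_ids[key] = len(class_ids)  # assign ids in order of first appearance
        targets.append(class_ids[key])
    return targets, class_ids


def classifier_chain_features(x_row: List[float], earlier_labels: List[int]) -> List[float]:
    """Classifier Chains: the k-th classifier sees the original features
    plus the 0/1 values of the labels that come earlier in the chain."""
    return x_row + [float(v) for v in earlier_labels]


# Hypothetical toy targets for four documents.
y = [{"finance", "merger"}, {"hiring"}, {"finance"}, {"finance", "merger"}]

br = binary_relevance_targets(y, LABELS)
lp_targets, lp_classes = label_powerset_targets(y)
chain_input = classifier_chain_features([0.2, 0.7], [1, 0])

print(br)
print(lp_targets, len(lp_classes))
print(chain_input)
```

The reshaping hints at why the methods can behave differently on imbalanced data. Label Powerset turns every rare label combination into its own rare class, and a chain passes early mistakes on to later classifiers, whereas Binary Relevance treats each label separately at the cost of ignoring label correlations.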
Taxonomy
Topics: Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining · Imbalanced Data Classification Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Layer Normalization · Weight Decay · Residual Connection · Softmax · Adam
