Linear Classifier: An Often-Forgotten Baseline for Text Classification
Yu-Chen Lin, Si-An Chen, Jie-Jyun Liu, and Chih-Jen Lin

TL;DR
This paper emphasizes the importance of using simple linear classifiers as baselines in text classification tasks to verify the effectiveness of advanced models like BERT, highlighting their competitive performance and robustness.
Contribution
It advocates for including linear classifiers as baselines in text classification to ensure the validity of results from advanced models like BERT.
Findings
Linear classifiers perform competitively on many text datasets.
Advanced models require proper application to achieve optimal results.
Simple baselines help validate the effectiveness of complex models.
Abstract
Large-scale pre-trained language models such as BERT are popular solutions for text classification. Due to the superior performance of these advanced methods, nowadays, people often directly train them for a few epochs and deploy the obtained model. In this opinion paper, we point out that this way may only sometimes get satisfactory results. We argue the importance of running a simple baseline like linear classifiers on bag-of-words features along with advanced methods. First, for many text data, linear methods show competitive performance, high efficiency, and robustness. Second, advanced models such as BERT may only achieve the best results if properly applied. Simple baselines help to confirm whether the results of advanced models are acceptable. Our experimental results fully support these points.
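To make the idea of "a simple baseline like linear classifiers on bag-of-words features" concrete, here is a minimal, dependency-free sketch. It is not the authors' implementation. A plain perceptron stands in for whatever linear model they trained, and the toy documents, labels, and function names are all hypothetical. A real baseline would more typically use a library-trained linear model, such as a linear SVM or logistic regression, on TF-IDF features.

```python
from collections import Counter


def bag_of_words(text):
    """Map a document to token counts (lowercased, whitespace-split)."""
    return Counter(text.lower().split())


def score(weights, bias, features):
    """Linear decision value: bias + sum of weight * count over tokens."""
    return bias + sum(weights.get(tok, 0.0) * cnt for tok, cnt in features.items())


def train_perceptron(docs, labels, epochs=10):
    """Train a binary perceptron; labels must be +1 or -1.

    Assumption: the perceptron is a stand-in for the linear
    classifier used in the paper, chosen only to keep the sketch short.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        mistakes = 0
        for text, y in zip(docs, labels):
            x = bag_of_words(text)
            if y * score(weights, bias, x) <= 0:  # misclassified or on the boundary
                for tok, cnt in x.items():
                    weights[tok] = weights.get(tok, 0.0) + y * cnt
                bias += y
                mistakes += 1
        if mistakes == 0:  # training data is perfectly separated
            break
    return weights, bias


def predict(weights, bias, text):
    """Return +1 or -1 for a raw text string."""
    return 1 if score(weights, bias, bag_of_words(text)) > 0 else -1


# Hypothetical toy sentiment data, for illustration only.
train_docs = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "boring plot awful acting",
]
train_labels = [1, 1, -1, -1]

weights, bias = train_perceptron(train_docs, train_labels)
print(predict(weights, bias, "loved the great acting"))
print(predict(weights, bias, "awful boring movie"))
```

A baseline of this kind trains in a fraction of the time needed to fine-tune a large model. If the fine-tuned model does not clearly beat it on held-out data, that suggests the advanced model may not have been applied properly.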
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Multi-Head Attention · Linear Layer · Linear Warmup With Linear Decay · Layer Normalization · Weight Decay · Residual Connection · Softmax · Adam · Dropout
