Classifying textual data: shallow, deep and ensemble methods

Laura Anderlucci; Lucia Guastadisegni; Cinzia Viroli

arXiv:1902.07068 · cs.CL · February 20, 2019 · 5 cites

Open Access

TL;DR

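This paper compares shallow, deep, and ensemble methods for text classification on high-dimensional, extremely sparse data from customer-care calls to an Italian phone company. Deep learning outperforms many classical (shallow) methods, and an ensemble that combines shallow and deep classifiers may further improve accuracy and robustness.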

Contribution

It provides a comparative evaluation of common and modern text classification methods, including deep learning and ensembles, and highlights the benefit of combining shallow and deep learners in a single ensemble classifier.

Findings

01. Deep learning outperforms many classical (shallow) methods.

02. An ensemble that combines shallow and deep classifiers may improve accuracy and robustness over any single method.

03. Combining methods is effective on sparse, high-dimensional data.

Abstract

This paper focuses on a comparative evaluation of the most common and modern methods for text classification, including recent deep learning strategies and ensemble methods. The study is motivated by a challenging real data problem, characterized by high-dimensional and extremely sparse data, deriving from incoming calls to the customer care of an Italian phone company. We will show that deep learning outperforms many classical (shallow) strategies, but that the combination of shallow and deep learning methods in a single ensemble classifier may improve the robustness and the accuracy of "single" classification methods.
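The idea of merging shallow and deep classifiers into one ensemble can be illustrated with a minimal sketch. The abstract does not state which combination rule the authors use, so the two rules below (majority voting and weighted averaging of class probabilities, often called soft voting) are assumptions chosen for illustration. The class labels, the three classifier outputs, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return the most frequent label; ties go to the label listed first."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


def soft_vote(probabilities, weights=None):
    """Average per-class probabilities across classifiers.

    probabilities: one dict per classifier mapping label -> probability.
    weights: optional per-classifier weights (defaults to equal weights).
    Returns the winning label and the averaged distribution.
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    totals = defaultdict(float)
    for probs, weight in zip(probabilities, weights):
        for label, p in probs.items():
            totals[label] += weight * p
    weight_sum = sum(weights)
    averaged = {label: total / weight_sum for label, total in totals.items()}
    return max(averaged, key=averaged.get), averaged


# Hypothetical class probabilities for one call transcript, produced by
# two shallow classifiers and one deep classifier (made-up numbers).
shallow_a = {"billing": 0.55, "technical": 0.35, "contract": 0.10}
shallow_b = {"billing": 0.30, "technical": 0.60, "contract": 0.10}
deep_c = {"billing": 0.20, "technical": 0.70, "contract": 0.10}

# Hard voting uses only each classifier's top label.
hard_label = majority_vote(["billing", "technical", "technical"])

# Soft voting uses the full probability distributions.
soft_label, averaged = soft_vote([shallow_a, shallow_b, deep_c])
```

In this toy case both rules pick the same label even though one shallow classifier disagrees, which is the kind of robustness an ensemble is meant to provide. Passing `weights` lets a stronger classifier count for more in the average.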

Taxonomy

Topics: Text and Document Classification Technologies · Advanced Text Analysis Techniques · Topic Modeling