TextZoo, a New Benchmark for Reconsidering Text Classification

Benyou Wang; Li Wang; Qikang Wei; Lichun Liu

arXiv:1802.03656·cs.CL·March 20, 2018

TL;DR

TextZoo introduces a unified benchmark for text classification, re-implementing more than 20 models on more than 10 datasets to enable fair comparison and analysis of neural network components.

Contribution

It provides a unified benchmark for comparing diverse neural network models for text classification and for revealing the advantage of each sub-component in various settings.

Findings

01 Re-implemented more than 20 models on more than 10 datasets

02 Analyzed the effects of different neural network components

03 Provided insights into model performance and component contributions

Abstract

Text representation is a fundamental concern in Natural Language Processing, especially in text classification. Recently, many neural network approaches with delicate representation models (e.g. FASTTEXT, CNN, RNN, and many hybrid models with attention mechanisms) have claimed state-of-the-art results on specific text classification datasets. However, the field lacks a unified benchmark to compare these models and reveal the advantage of each sub-component in various settings. We re-implement more than 20 popular text representation models for classification on more than 10 datasets. In this paper, we reconsider the text classification task from the perspective of neural networks and derive several effects from an analysis of the above results.

Taxonomy

Topics: Topic Modeling · Advanced Text Analysis Techniques · Text and Document Classification Technologies