Semantic Classification of Tabular Datasets via Character-Level Convolutional Neural Networks
Paul Azunre, Craig Corcoran, Numa Dhamani, Jeffrey Gleason, Garrett Honke, David Sullivan, Rebecca Ruppel, Sandeep Verma, Jonathon Morgan

TL;DR
This paper introduces a character-level CNN that semantically classifies columns in tabular data, shows that the approach carries over to other tasks and domains, and uses transfer learning to adapt the model with less labeled data.
Contribution
The paper presents a character-level CNN combined with transfer learning for flexible semantic classification of tabular data, which reduces the amount of labeled data required.
Findings
Effective across tabular data classification, age prediction from social media posts, and email spam classification
Transfer learning enhances model adaptability with less labeled data
Character-level analysis achieves competitive accuracy without metadata
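To make the character-level idea concrete, the sketch below shows one way a column's cells could be turned into fixed-length sequences of character indices, which is the kind of input a character-level CNN consumes. It is a minimal illustration only: the alphabet, the lengths, and the names `encode_cell` and `encode_column` are assumptions made for this example and are not taken from the paper.

```python
import string

# Hypothetical alphabet and sizes; the paper's actual choices are not stated here.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation + " "
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}  # 0 = padding/unknown
MAX_CELL_LEN = 20


def encode_cell(cell, max_len=MAX_CELL_LEN):
    """Map one cell to a fixed-length list of character indices."""
    text = str(cell).lower()[:max_len]
    indices = [CHAR_TO_INDEX.get(ch, 0) for ch in text]
    # Right-pad short cells with zeros so every cell has the same length.
    return indices + [0] * (max_len - len(indices))


def encode_column(cells, max_cells=5, max_len=MAX_CELL_LEN):
    """Encode the first max_cells cells of a column, padding with empty rows."""
    rows = [encode_cell(c, max_len) for c in cells[:max_cells]]
    rows += [[0] * max_len for _ in range(max_cells - len(rows))]
    return rows


# Only the raw cell text is used; no column header or other metadata.
encoded = encode_column(["2019-03-01", "2020-11-15"], max_cells=3, max_len=12)
```

Each column becomes a `max_cells` by `max_len` grid of integers, so a convolutional model can look for character patterns (digits separated by hyphens, for instance) without relying on the column name.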
Abstract
A character-level convolutional neural network (CNN) motivated by applications in "automated machine learning" (AutoML) is proposed to semantically classify columns in tabular data. Simulated data containing a set of base classes is first used to learn an initial set of weights. Hand-labeled data from the CKAN repository is then used in a transfer-learning paradigm to adapt the initial weights to a more sophisticated representation of the problem (e.g., including more classes). In doing so, realistic data imperfections are learned and the set of classes handled can be expanded from the base set with reduced labeled data and computing power requirements. Results show the effectiveness and flexibility of this approach in three diverse domains: semantic classification of tabular data, age prediction from social media posts, and email spam classification. In addition to providing further…
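The class-expansion step described in the abstract can be illustrated with a small sketch of one common transfer-learning pattern: keep the output weights already learned for the base classes and add freshly initialized weights for the new ones before fine-tuning on labeled data. This is an assumption about how such an expansion might be done, not the paper's confirmed procedure; the function `expand_output_layer`, the class names, and the weight values are all hypothetical.

```python
import random


def expand_output_layer(base_weights, base_classes, target_classes,
                        seed=0, scale=0.01):
    """Build output-layer weights for target_classes.

    Rows for classes already in base_classes are copied from base_weights;
    rows for new classes get small random values (an assumed initialization).
    """
    rng = random.Random(seed)
    n_features = len(base_weights[0])
    row_of = dict(zip(base_classes, base_weights))
    expanded = []
    for name in target_classes:
        if name in row_of:
            expanded.append(list(row_of[name]))  # reuse learned weights
        else:
            expanded.append([rng.uniform(-scale, scale)
                             for _ in range(n_features)])
    return expanded


# Hypothetical base classes and weights, standing in for a model
# pretrained on simulated data.
base_classes = ["integer", "date"]
base_weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]]
target_classes = ["integer", "date", "address"]

expanded = expand_output_layer(base_weights, base_classes, target_classes)
```

In a full model, the earlier layers would also start from their pretrained values, and the whole network (or only the new rows) would then be fine-tuned on the hand-labeled data.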
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Authorship Attribution and Profiling
