ALL-IN-1: Short Text Classification with One Model for All Languages

Barbara Plank

arXiv:1710.09589 · cs.CL · October 27, 2017 · 1 cites


TL;DR

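ALL-IN-1 is a simple multilingual text classification model that combines a Support Vector Machine with multilingual word embeddings and character n-grams. It needs no parallel data and ranked 1st out of 12 teams in the IJCNLP 2017 shared task on customer feedback analysis, which covered English, French, Japanese and Spanish.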

Contribution

The paper presents a simple, effective approach to multilingual classification that does not rely on parallel data and ranked 1st overall in the IJCNLP 2017 shared task on customer feedback analysis.

Findings

01

Ranked 1st overall out of 12 teams in the IJCNLP 2017 shared task on customer feedback analysis

02

Effective across four languages: English, French, Japanese and Spanish

03

Does not require parallel data for training

Abstract

We present ALL-IN-1, a simple model for multilingual text classification that does not require any parallel data. It is based on a traditional Support Vector Machine classifier exploiting multilingual word embeddings and character n-grams. Our model is simple, easily extendable yet very effective, overall ranking 1st (out of 12 teams) in the IJCNLP 2017 shared task on customer feedback analysis in four languages: English, French, Japanese and Spanish.
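The abstract names two feature families, character n-grams and multilingual word embeddings, that feed a Support Vector Machine. The sketch below illustrates only the feature side under stated assumptions: the n-gram range, the whitespace tokenization (which would not suit Japanese), the helper names `char_ngrams`, `mean_embedding` and `featurize`, and the toy 2-dimensional vectors in `TOY_EMBEDDINGS` are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count character n-grams of length n_min..n_max in text."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def mean_embedding(tokens, embeddings, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def featurize(text, embeddings, dim):
    """Merge sparse n-gram counts and dense embedding values into one dict."""
    lowered = text.lower()
    feats = {"char=" + gram: float(c) for gram, c in char_ngrams(lowered).items()}
    # Assumption: whitespace tokenization, a simplification for illustration.
    dense = mean_embedding(lowered.split(), embeddings, dim)
    for i, value in enumerate(dense):
        feats["emb_%d" % i] = value
    return feats


# Made-up vectors standing in for a shared multilingual embedding space.
TOY_EMBEDDINGS = {
    "good": [1.0, 0.0],
    "bon": [0.9, 0.1],
    "bad": [0.0, 1.0],
}

if __name__ == "__main__":
    features = featurize("bon", TOY_EMBEDDINGS, dim=2)
    for name in sorted(features):
        print(name, features[name])
```

A feature dictionary of this shape could then be vectorized and passed to any linear SVM implementation. Because the embedding space is shared across languages, the same classifier can in principle score text in every language without parallel training data.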


Code & Models

Repositories

bplank/ijcnlp2017-customer-feedback
Official


Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies