DOC: Deep Open Classification of Text Documents

Lei Shu; Hu Xu; Bing Liu

arXiv:1709.08716·cs.CL·September 27, 2017·24 cites

TL;DR

This paper introduces a deep learning approach to open-world text classification that identifies test documents belonging to classes not seen during training. The authors report that it dramatically outperforms existing state-of-the-art methods.

Contribution

The paper presents a novel deep learning method for open classification of text documents, addressing the challenge of detecting, at test time, documents from classes that were unseen in training.

Findings

01. Outperforms state-of-the-art techniques dramatically

02. Effective in identifying novel, unseen documents

03. Advances open-world text classification capabilities

Abstract

Traditional supervised learning makes the closed-world assumption that the classes that appear in the test data must have appeared in training. This also applies to text learning, or text classification. As learning is used increasingly in dynamic open environments, where some new or test documents may not belong to any of the training classes, identifying these novel documents during classification becomes an important problem. This problem is called open-world classification, or open classification. This paper proposes a novel deep learning-based approach to it, which outperforms existing state-of-the-art techniques dramatically.
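The open classification setting described in the abstract can be illustrated with a small sketch. It is a generic reject-by-threshold rule and not necessarily the method the paper proposes: the function name `open_classify`, the per-class scores, and the fixed threshold of 0.5 are all assumptions made for illustration. A closed-world classifier would always return one of the training classes, while an open classifier may also answer "none of them".

```python
from typing import Dict, Optional


def open_classify(scores: Dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Return the best-scoring seen class, or None to reject the document.

    `scores` maps each training class to an independent confidence in [0, 1]
    (hypothetical values; any per-class scorer could produce them).
    A document is rejected as novel when no class reaches `threshold`.
    """
    best_label = max(scores, key=scores.get)
    if scores[best_label] < threshold:
        return None  # document is treated as belonging to no training class
    return best_label


# A document that clearly matches a seen class.
seen = open_classify({"sports": 0.91, "politics": 0.12, "tech": 0.05})

# A document that matches none of the seen classes well.
novel = open_classify({"sports": 0.21, "politics": 0.18, "tech": 0.30})
```

Here `seen` is `"sports"` and `novel` is `None`. In practice the threshold would be tuned on held-out data rather than fixed by hand.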

Taxonomy

Topics: Text and Document Classification Technologies · Anomaly Detection Techniques and Applications · Imbalanced Data Classification Techniques