Semi-supervised Nonnegative Matrix Factorization for Document Classification

Jamie Haddock; Lara Kassab; Sixian Li; Alona Kryshchenko; Rachel Grotheer; Elena Sizikova; Chuntian Wang; Thomas Merkh; RWMA Madushani; Miju Ahn; Deanna Needell; Kathryn Leonard

arXiv:2203.03551·cs.IR·March 8, 2022·1 cites

TL;DR

This paper introduces semi-supervised nonnegative matrix factorization models that jointly perform topic modeling and document classification, providing interpretable results and flexible application to various supervised learning tasks.

Contribution

The paper presents new SSNMF models with multiplicative update training methods, combining topic modeling and classification in a unified, interpretable framework.

Findings

01 Effective on 20 Newsgroups dataset

02 Applicable to multi-label classification

03 Flexible for other supervised tasks

Abstract

We propose new semi-supervised nonnegative matrix factorization (SSNMF) models for document classification and provide motivation for these models as maximum likelihood estimators. The proposed SSNMF models simultaneously provide both a topic model and a model for classification, thereby offering highly interpretable classification results. We derive training methods using multiplicative updates for each new model, and demonstrate the application of these models to single-label and multi-label document classification, although the models are flexible to other supervised learning tasks such as regression. We illustrate the promise of these models and training methods on document classification datasets (e.g., 20 Newsgroups, Reuters).
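
As a rough illustration of what training an SSNMF model by multiplicative updates involves, the sketch below implements the classical Frobenius-norm SSNMF objective, $\|X - AS\|_F^2 + \lambda \|Y - BS\|_F^2$ subject to $A, B, S \ge 0$. It is not one of the new models proposed in the paper, and it omits any handling of missing data or unlabeled documents. The function names (`ssnmf_frobenius`, `predict_labels`), the weight `lam`, the iteration counts, the random initialization, and the small constant `eps` are all assumptions made for this example.

```python
import numpy as np


def ssnmf_frobenius(X, Y, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Sketch of Frobenius-norm SSNMF trained with multiplicative updates.

    X: (n_features, n_docs) nonnegative document-term data, one column per document.
    Y: (n_classes, n_docs) nonnegative label matrix (e.g. one-hot or multi-hot columns).
    Returns the topic matrix A, the classifier matrix B, the document
    representations S, and the objective value after each iteration.
    """
    rng = np.random.default_rng(seed)
    n_features, n_docs = X.shape
    n_classes = Y.shape[0]

    # Random nonnegative initialization (an assumption; other schemes are possible).
    A = rng.random((n_features, k))
    B = rng.random((n_classes, k))
    S = rng.random((k, n_docs))

    history = []
    for _ in range(n_iter):
        # Each factor is rescaled elementwise by a ratio of nonnegative terms,
        # so nonnegativity is preserved without an explicit projection step.
        SSt = S @ S.T
        A *= (X @ S.T) / (A @ SSt + eps)
        B *= (Y @ S.T) / (B @ SSt + eps)
        S *= (A.T @ X + lam * (B.T @ Y)) / ((A.T @ A + lam * (B.T @ B)) @ S + eps)

        objective = (np.linalg.norm(X - A @ S) ** 2
                     + lam * np.linalg.norm(Y - B @ S) ** 2)
        history.append(objective)

    return A, B, S, history


def predict_labels(X_new, A, B, n_iter=200, eps=1e-10, seed=0):
    """Classify new documents: fit S with A held fixed, then score with B."""
    rng = np.random.default_rng(seed)
    S = rng.random((A.shape[1], X_new.shape[1]))
    AtX = A.T @ X_new
    AtA = A.T @ A
    for _ in range(n_iter):
        # Labels are unknown at test time, so only the data term drives S.
        S *= AtX / (AtA @ S + eps)
    # Single-label prediction: the class with the largest score per document.
    return (B @ S).argmax(axis=0)
```

In this sketch the columns of `A` play the role of topics, while `B` maps each document's topic weights in `S` to class scores, which is what makes the classification readable in terms of topics. For multi-label data one could threshold the columns of `B @ S` instead of taking the argmax; the appropriate threshold is a modeling choice not fixed here.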

Taxonomy

Topics: Text and Document Classification Technologies · Face and Expression Recognition · Image Retrieval and Classification Techniques