Semi-orthogonal Non-negative Matrix Factorization with an Application in Text Mining
Jack Yutong Li, Ruoqing Zhu, Annie Qu, Han Ye, Zhankun Sun

TL;DR
This paper introduces semi-orthogonal non-negative matrix factorization (SONMF), a novel method for bi-clustering high-dimensional, noisy text data in medical applications, improving interpretability and classification accuracy.
Contribution
The paper proposes SONMF, a new matrix factorization technique that enhances interpretability and classification performance on complex text data in healthcare.
Findings
SONMF outperforms existing NMF methods in factorization accuracy
The method increases classification accuracy over traditional models
SONMF demonstrates faster convergence and better orthogonality in experiments
Abstract
Emergency Department (ED) crowding is a worldwide issue that affects the efficiency of hospital management and the quality of patient care. Crowding occurs when the request for a ward bed to receive an admitted patient is delayed until a doctor makes the admission decision. To reduce ED overcrowding and waiting time, we build a classifier to predict the disposition of patients using manually typed nurse notes collected during triage, thereby allowing hospital staff to begin necessary preparation beforehand. However, these triage notes are high-dimensional, noisy, and sparse text data, which makes model fitting and interpretation difficult. To address this issue, we propose semi-orthogonal non-negative matrix factorization (SONMF) for both continuous and binary design matrices to first bi-cluster the patients and words into a reduced number of topics. The subjects can…
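To make the idea of a semi-orthogonal factorization concrete, the sketch below factorizes a patient-by-word matrix X as F Gᵀ with F non-negative and G having orthonormal columns. It is an illustration under stated assumptions, not the authors' algorithm: the function name `semi_orthogonal_nmf`, the choice of which factor carries the orthogonality constraint, and the update rules (a closed-form non-negative projection for F and an SVD-based orthogonal Procrustes step for G) are all assumptions made here for clarity, and only the continuous (squared-error) case is covered.

```python
import numpy as np

def semi_orthogonal_nmf(X, k, n_iter=100, seed=0):
    """Illustrative alternating minimisation of ||X - F @ G.T||_F^2
    subject to F >= 0 and G.T @ G = I (hypothetical helper, not the paper's code)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Start from a random matrix with orthonormal columns (reduced QR).
    G, _ = np.linalg.qr(rng.standard_normal((p, k)))
    losses = []
    for _ in range(n_iter):
        # F-step: because G.T @ G = I, the objective separates over the
        # entries of F, so clipping X @ G at zero is the exact minimiser.
        F = np.maximum(X @ G, 0.0)
        # G-step: orthogonal Procrustes. With X.T @ F = U S V^T,
        # G = U V^T minimises the objective over orthonormal G.
        U, _, Vt = np.linalg.svd(X.T @ F, full_matrices=False)
        G = U @ Vt
        losses.append(np.linalg.norm(X - F @ G.T) ** 2)
    return F, G, losses

# Toy usage on random data (purely synthetic, for illustration only).
X_demo = np.random.default_rng(1).random((30, 12))
F_demo, G_demo, losses_demo = semi_orthogonal_nmf(X_demo, k=3, n_iter=20)
# Simple bi-clustering readout: assign each row (patient) and each
# column (word) to the topic with the largest loading.
row_topic = F_demo.argmax(axis=1)
col_topic = np.abs(G_demo).argmax(axis=1)
```

Each step exactly minimises the squared error over one factor with the other held fixed, so the recorded loss cannot increase from one iteration to the next. The orthonormal factor keeps topics from overlapping heavily, which is one plausible reading of why orthogonality can help interpretability.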
Taxonomy
Topics: Topic Modeling · Speech Recognition and Synthesis · Text and Document Classification Technologies
