Federated Non-negative Matrix Factorization for Short Texts Topic Modeling with Mutual Information
Shijing Si, Jianzong Wang, Ruiyi Zhang, Qinliang Su, Jing Xiao

TL;DR
This paper introduces FedNMF, a privacy-preserving federated learning framework for topic modeling on short texts, and FedNMF+MI, a variant that adds mutual information maximization to improve performance over existing federated topic models.
Contribution
The paper proposes FedNMF+MI, a novel federated NMF approach that incorporates mutual information to improve topic modeling accuracy on heterogeneous short text data.
Findings
FedNMF+MI outperforms FedLDA and standard FedNMF in coherence and F1 score.
Mutual information maximization mitigates performance degradation in federated topic modeling.
The framework effectively preserves data privacy while maintaining high-quality topic extraction.
Abstract
Non-negative matrix factorization (NMF) based topic modeling is widely used in natural language processing (NLP) to uncover hidden topics in short text documents. Usually, training a high-quality topic model requires a large amount of textual data. In many real-world scenarios, customer textual data is private and sensitive, which precludes uploading it to data centers. This paper proposes a Federated NMF (FedNMF) framework, which allows multiple clients to collaboratively train a high-quality NMF based topic model with locally stored data. However, standard federated learning will significantly undermine the performance of topic models in downstream tasks (e.g., text classification) when the data distribution over clients is heterogeneous. To alleviate this issue, we further propose FedNMF+MI, which simultaneously maximizes the mutual information (MI) between the count features of local…
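To make the federated setup in the abstract concrete, here is a minimal sketch of one way a federated NMF round could work. It is an illustration under stated assumptions, not the authors' algorithm: the summary does not specify the local update rule or the aggregation step, so this sketch assumes Lee-Seung multiplicative updates on each client and a FedAvg-style weighted average of the shared topic-word matrix on the server. The names `local_update` and `federated_nmf` are hypothetical, and the mutual information term of FedNMF+MI is not modeled.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero in the multiplicative updates


def local_update(X, W, H, n_steps=5):
    """Run a few multiplicative NMF updates on one client's data.

    X: (docs, vocab) non-negative count matrix that stays on the client.
    W: (docs, topics) document-topic matrix, kept private to the client.
    H: (topics, vocab) topic-word matrix received from the server.
    """
    H = H.copy()  # do not modify the server's copy in place
    for _ in range(n_steps):
        W = W * (X @ H.T) / (W @ H @ H.T + EPS)
        H = H * (W.T @ X) / (W.T @ W @ H + EPS)
    return W, H


def federated_nmf(client_data, n_topics, n_rounds=20, seed=0):
    """Assumed FedAvg-style loop: only H is shared; X and W never leave a client."""
    rng = np.random.default_rng(seed)
    vocab = client_data[0].shape[1]
    H = rng.random((n_topics, vocab)) + 0.1
    Ws = [rng.random((X.shape[0], n_topics)) + 0.1 for X in client_data]

    # Weight each client's contribution by its number of documents.
    sizes = np.array([X.shape[0] for X in client_data], dtype=float)
    weights = sizes / sizes.sum()

    for _ in range(n_rounds):
        local_Hs = []
        for k, X in enumerate(client_data):
            Ws[k], H_k = local_update(X, Ws[k], H)
            local_Hs.append(H_k)
        # Server step: weighted average of the clients' topic-word matrices.
        H = sum(w * H_k for w, H_k in zip(weights, local_Hs))
    return Ws, H
```

In this sketch each client would call `local_update` on its own document-term counts, and only the topic-word matrix `H` is exchanged. A weighted average of non-negative matrices stays non-negative, so the aggregated `H` remains a valid NMF factor; with heterogeneous clients, plain averaging of this kind is where the abstract says performance degrades.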
Taxonomy
Topics: Text and Document Classification Technologies · Recommender Systems and Techniques · Topic Modeling
