Decentralized EM to Learn Gaussian Mixtures from Datasets Distributed by Features
Pedro Valdeira, Cláudia Soares, João Xavier

TL;DR
This paper introduces a decentralized EM algorithm for Gaussian mixture models that works with vertically partitioned data, enabling privacy-preserving, scalable clustering across distributed datasets with different features.
Contribution
It presents the first EM-based method for Gaussian mixtures on vertically partitioned data, extending federated learning to feature-distributed datasets.
Findings
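In federated learning setups, VP-EM matches centralized EM fitting of Gaussian mixtures constrained to a subspace.
In arbitrary communication graphs, consensus averaging lets VP-EM run on large peer-to-peer networks as an approximation of EM.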
VP-EM outperforms existing benchmarks in accuracy.
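The consensus averaging mentioned in the findings can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the ring topology, the uniform 1/3 mixing weights, and the names ring_weights and consensus_errors are assumptions chosen for illustration. Each node repeatedly averages its value with its neighbors' values, and the gap to the network-wide mean shrinks geometrically with the number of rounds.

```python
import numpy as np

def ring_weights(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 nodes.

    Assumption for illustration: each node gives weight 1/3 to itself
    and to each of its two neighbors.
    """
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % n] = 1.0 / 3.0
        W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def consensus_errors(x0, W, rounds):
    """Run `rounds` averaging steps and record the worst-case gap to the mean."""
    target = x0.mean()
    x = x0.copy()
    errors = []
    for _ in range(rounds):
        x = W @ x  # every node mixes with its neighbors only
        errors.append(np.max(np.abs(x - target)))
    return x, errors

rng = np.random.default_rng(0)
x0 = rng.normal(size=8)   # one local scalar per node
W = ring_weights(8)
x, errors = consensus_errors(x0, W, rounds=60)
```

Because W is doubly stochastic, the network average is preserved at every round, so the nodes converge to the same value a central server would compute by averaging directly.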
Abstract
Expectation Maximization (EM) is the standard method to learn Gaussian mixtures. Yet its classic, centralized form is often infeasible, due to privacy concerns and computational and communication bottlenecks. Prior work dealt with data distributed by examples, horizontal partitioning, but we lack a counterpart for data scattered by features, an increasingly common scheme (e.g. user profiling with data from multiple entities). To fill this gap, we provide an EM-based algorithm to fit Gaussian mixtures to Vertically Partitioned data (VP-EM). In federated learning setups, our algorithm matches the centralized EM fitting of Gaussian mixtures constrained to a subspace. In arbitrary communication graphs, consensus averaging allows VP-EM to run on large peer-to-peer networks as an EM approximation. This mismatch comes from consensus error only, which vanishes exponentially fast with the number…
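To make the vertical partitioning setting concrete, here is a minimal Python sketch of an E-step when features are split across nodes. It assumes diagonal covariances, which is a simplification made for this illustration and is not the subspace-constrained model the abstract describes; the function names local_log_terms and responsibilities are hypothetical. Under that assumption the Gaussian log-density is a sum over feature blocks, so each node can compute its partial term locally and only the per-component sums need to be aggregated.

```python
import numpy as np

def local_log_terms(X_block, means_block, vars_block):
    """Partial log-densities from one node's feature block.

    X_block: (n_samples, d_block); means_block, vars_block: (n_components, d_block).
    Returns an (n_samples, n_components) array.
    """
    diff = X_block[:, None, :] - means_block[None, :, :]
    return -0.5 * np.sum(diff**2 / vars_block + np.log(2 * np.pi * vars_block), axis=2)

def responsibilities(log_terms_sum, weights):
    """Posterior component probabilities from aggregated log-densities."""
    log_post = log_terms_sum + np.log(weights)
    log_post = log_post - log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
means = rng.normal(size=(3, 5))
variances = rng.uniform(0.5, 2.0, size=(3, 5))
weights = np.array([0.2, 0.3, 0.5])

# Hypothetical split: node A holds features 0-1, node B holds features 2-4.
blocks = [slice(0, 2), slice(2, 5)]
aggregated = sum(local_log_terms(X[:, b], means[:, b], variances[:, b]) for b in blocks)

resp_vp = responsibilities(aggregated, weights)
resp_central = responsibilities(local_log_terms(X, means, variances), weights)
```

In this simplified setting resp_vp and resp_central agree up to floating-point error, since the aggregation only reorders a sum. In a peer-to-peer network the exact sum would be replaced by a consensus estimate, which is where an approximation error would enter.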
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Complex Network Analysis Techniques
