Leveraging Optimal Transport for Distributed Two-Sample Testing: An Integrated Transportation Distance-based Framework
Zhengqi Lin, Yan Chen

TL;DR
This paper proposes a new distributed two-sample testing framework using the Integrated Transportation Distance, enabling effective detection of distributional changes in decentralized data settings while maintaining privacy.
Contribution
It introduces the ITD framework based on optimal transport for distributed two-sample testing, with theoretical analysis and practical permutation testing procedures.
Findings
ITD achieves robust Type I error control
ITD attains high power in detecting subtle distributional shifts
ITD remains effective in heterogeneous, high-dimensional data environments
Abstract
This paper introduces a novel framework for distributed two-sample testing using the Integrated Transportation Distance (ITD), an extension of the Optimal Transport distance. The approach addresses the challenges of detecting distributional changes in decentralized learning or federated learning environments, where data privacy and heterogeneity are significant concerns. We provide theoretical foundations for the ITD, including convergence properties and asymptotic behavior. A permutation test procedure is proposed for practical implementation in distributed settings, allowing for efficient computation while preserving data privacy. The framework's performance is demonstrated through theoretical power analysis and extensive simulations, showing robust Type I error control and high power across various distributions and dimensions. The results indicate that ITD effectively aggregates…
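To make the idea of a permutation test in a distributed setting concrete, the following is a minimal sketch, not the paper's procedure. It assumes one-dimensional data, uses the empirical 1-Wasserstein distance at each node, and averages the per-node distances as a simple stand-in for an aggregated statistic; the actual ITD definition is not given in this summary. The function names (`wasserstein_1d`, `local_statistics`, `distributed_permutation_test`) are hypothetical. Each node shares only scalar statistics, never raw samples.

```python
import numpy as np


def wasserstein_1d(x, y):
    """Empirical 1-Wasserstein distance between two 1-D samples,
    computed as the integral of |F_x - F_y| over the pooled support."""
    all_vals = np.sort(np.concatenate([x, y]))
    deltas = np.diff(all_vals)
    cdf_x = np.searchsorted(np.sort(x), all_vals[:-1], side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), all_vals[:-1], side="right") / len(y)
    return float(np.sum(np.abs(cdf_x - cdf_y) * deltas))


def local_statistics(x, y, n_perm, rng):
    """Runs on a single node: returns the observed distance and the
    distances under n_perm local label permutations (scalars only)."""
    pooled = np.concatenate([x, y])
    n = len(x)
    observed = wasserstein_1d(x, y)
    permuted = np.empty(n_perm)
    for b in range(n_perm):
        shuffled = rng.permutation(pooled)
        permuted[b] = wasserstein_1d(shuffled[:n], shuffled[n:])
    return observed, permuted


def distributed_permutation_test(node_samples, n_perm=200, seed=0):
    """Aggregates per-node statistics by a plain average (an assumption,
    not the paper's ITD) and returns the statistic and permutation p-value."""
    rng = np.random.default_rng(seed)
    results = [local_statistics(x, y, n_perm, rng) for x, y in node_samples]
    observed = float(np.mean([obs for obs, _ in results]))
    permuted = np.mean([perm for _, perm in results], axis=0)
    # The +1 terms keep the p-value strictly positive and the test valid.
    p_value = (1 + np.sum(permuted >= observed)) / (1 + n_perm)
    return observed, float(p_value)


# Synthetic example: three nodes, each holding two samples that differ in mean.
data_rng = np.random.default_rng(1)
nodes = [(data_rng.normal(0, 1, 50), data_rng.normal(2, 1, 50)) for _ in range(3)]
stat, p = distributed_permutation_test(nodes)
print(f"aggregated statistic = {stat:.3f}, p-value = {p:.4f}")
```

Permuting labels within each node keeps the null distribution valid when nodes are heterogeneous, since exchangeability is only required locally. A multivariate version would need a proper optimal transport solver in place of `wasserstein_1d`.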
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Random Matrices and Applications · Distributed Sensor Networks and Detection Algorithms
