Scalable and Distributed Individualized Treatment Rules for Massive Datasets
Nan Qiao, Wangcheng Li, Jingxiao Zhang, Canyi Chen

TL;DR
This paper introduces a scalable, privacy-preserving distributed method for learning individualized treatment rules using a convolution-smoothed SVM, achieving optimal performance with minimal communication across data sources.
Contribution
It develops a novel convolution-smoothed weighted SVM and an efficient distributed learning algorithm that preserves data privacy and reduces communication costs for multi-source ITRs.
Findings
The method achieves optimal statistical performance with a fixed number of communication rounds.
An efficient coordinate gradient descent algorithm guarantees linear convergence.
The method is validated through simulations and real sepsis treatment data.
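The communication pattern behind these findings can be illustrated with a minimal sketch. It is not the paper's algorithm: plain gradient averaging stands in for the authors' multi-round procedure and coordinate gradient descent, and the Gaussian smoothing kernel, the function names (`local_gradient`, `distributed_fit`), and the defaults for `h`, `lam`, `step`, and `rounds` are all assumptions made for illustration. The point is only that each site sends a gradient vector and a sample count, never raw patient records.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def local_gradient(beta, site, h):
    """Sum of per-sample gradients of a Gaussian-smoothed weighted hinge loss.

    site = (X, A, W): feature rows, treatments in {-1, +1}, nonnegative weights.
    Only the returned vector and the sample count leave the site.
    """
    X, A, W = site
    grad = [0.0] * len(beta)
    for x, a, w in zip(X, A, W):
        z = [1.0] + list(x)  # prepend the intercept term
        score = sum(b * zj for b, zj in zip(beta, z))
        # Derivative of w * smoothed_hinge(a * score) with respect to score.
        slope = -w * a * normal_cdf((1.0 - a * score) / h)
        for j, zj in enumerate(z):
            grad[j] += slope * zj
    return grad, len(X)

def distributed_fit(sites, dim, h=0.5, lam=0.01, step=0.2, rounds=50):
    """Run a fixed number of communication rounds of averaged gradient descent."""
    beta = [0.0] * (dim + 1)
    for _ in range(rounds):
        # One round: broadcast beta, collect one gradient per site.
        replies = [local_gradient(beta, site, h) for site in sites]
        n_total = sum(n for _, n in replies)
        for j in range(len(beta)):
            g = sum(gr[j] for gr, _ in replies) / n_total
            if j > 0:  # ridge penalty on slopes only, not the intercept
                g += 2.0 * lam * beta[j]
            beta[j] -= step * g
    return beta

# Hypothetical toy data: two sites, one covariate, best treatment is sign(x).
site_a = ([[1.0], [2.0], [-1.0]], [1, 1, -1], [1.0, 1.0, 1.0])
site_b = ([[-2.0], [1.5], [-0.5]], [-1, 1, -1], [1.0, 1.0, 1.0])
beta_hat = distributed_fit([site_a, site_b], dim=1)
```

In this simplified scheme the averaged gradient equals the gradient on the pooled data, so the distributed fit matches a pooled fit up to floating-point rounding while the records stay local.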
Abstract
Synthesizing information from multiple data sources is crucial for constructing accurate individualized treatment rules (ITRs). However, privacy concerns often present significant barriers to the integrative analysis of such multi-source data. Classical meta-learning, which averages local estimates to derive the final ITR, is frequently suboptimal due to biases in these local estimates. To address these challenges, we propose a convolution-smoothed weighted support vector machine for learning the optimal ITR. The accompanying loss function is both convex and smooth, which allows us to develop an efficient multi-round distributed learning procedure for ITRs. Such distributed learning ensures optimal statistical performance with a fixed number of communication rounds, thereby minimizing coordination costs across data centers while preserving data privacy. Our method avoids pooling…
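To make the idea of a convex, smooth loss concrete, here is a minimal sketch of one way to convolution-smooth the hinge loss. It assumes a Gaussian kernel with bandwidth `h`, which gives the closed form t·Φ(t/h) + h·φ(t/h) with t = 1 − u. The kernel choice, the per-sample weights `W` (for example, an outcome divided by a propensity score), the ridge penalty `lam`, and all function names are illustrative assumptions, not details taken from the paper.

```python
import math

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def hinge(u):
    """Standard (non-smooth) hinge loss at margin u."""
    return max(0.0, 1.0 - u)

def smoothed_hinge(u, h):
    """Hinge loss convolved with a Gaussian kernel of bandwidth h > 0.

    Equals E[max(0, 1 - u - h*Z)] for Z ~ N(0, 1).
    """
    t = 1.0 - u
    return t * std_normal_cdf(t / h) + h * std_normal_pdf(t / h)

def smoothed_hinge_grad(u, h):
    """Derivative of smoothed_hinge with respect to u; continuous in u."""
    return -std_normal_cdf((1.0 - u) / h)

def weighted_smoothed_risk(beta, X, A, W, h, lam):
    """Weighted empirical risk for a linear rule sign(beta[0] + x . beta[1:]).

    X: list of feature rows; A: treatments in {-1, +1};
    W: nonnegative per-sample weights (assumed form, see lead-in).
    """
    total = 0.0
    for x, a, w in zip(X, A, W):
        score = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
        total += w * smoothed_hinge(a * score, h)
    penalty = lam * sum(b * b for b in beta[1:])
    return total / len(X) + penalty
```

As `h` shrinks, `smoothed_hinge` approaches the ordinary hinge loss. For any `h > 0` it stays convex and has a continuous gradient, and that smoothness is what makes gradient-based distributed updates workable.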
Taxonomy
Topics: Machine Learning in Healthcare · Privacy-Preserving Technologies in Data · Advanced Causal Inference Techniques
