A distributed regression analysis application based on SAS software. Part I: Linear and logistic regression
Qoua L. Her, Yury Vilk, Jessica Young, Zilu Zhang, Jessica M. Malenfant, Sarah Malek, Sengwee Toh

TL;DR
This paper introduces a distributed regression analysis application compatible with Base SAS and SAS/STAT, enabling privacy-preserving multivariable regression analysis in large distributed data networks without requiring SAS/IML.
Contribution
It presents a novel DRA implementation for SAS software that works within existing modules, expanding privacy-preserving analysis capabilities in large data networks.
Findings
The DRA application was tested successfully in Base SAS and SAS/STAT.
It enables multivariable-adjusted regression analysis using only summary-level information from participating sites.
It supports linear and logistic regression in horizontally partitioned distributed data networks.
Abstract
Previous work has demonstrated the feasibility and value of conducting distributed regression analysis (DRA), a privacy-protecting analytic method that performs multivariable-adjusted regression analysis with only summary-level information from participating sites. To our knowledge, there are no DRA applications in SAS, the statistical software used by several large national distributed data networks (DDNs), including the Sentinel System and PCORnet. SAS/IML is available to perform the required matrix computations for DRA in the SAS system. However, not all data partners in these large DDNs have access to SAS/IML, which is licensed separately. In this first article of a two-paper series, we describe a DRA application developed for use in Base SAS and SAS/STAT modules for linear and logistic DRA within horizontally partitioned DDNs and its successful tests.
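To make the idea of regression from summary-level information concrete, the following is a minimal sketch of linear DRA on horizontally partitioned data. It is written in Python rather than SAS and is not the authors' application. It assumes each site shares only its cross-product matrices X'X and X'y and its row count, and that a coordinating center sums them and solves the normal equations. All function names and the two-site data are hypothetical.

```python
# Sketch of linear DRA on horizontally partitioned data (illustrative only).
# Assumption: each site shares X'X, X'y, and its row count, never row-level data.

def site_summaries(rows, ys):
    """Run locally at one site: return (X'X, X'y, n), adding an intercept column."""
    p = len(rows[0]) + 1
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for row, y in zip(rows, ys):
        x = [1.0] + list(row)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty, len(rows)

def combine(summaries):
    """Run at the coordinating center: add the site-level summaries element-wise."""
    p = len(summaries[0][1])
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    n = 0
    for s_xtx, s_xty, s_n in summaries:
        for i in range(p):
            xty[i] += s_xty[i]
            for j in range(p):
                xtx[i][j] += s_xtx[i][j]
        n += s_n
    return xtx, xty, n

def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (m[i][size] - tail) / m[i][i]
    return x

# Hypothetical data held at two sites; the outcome follows y = 1 + 2 * x exactly.
site_a = ([[0.0], [1.0], [2.0]], [1.0, 3.0, 5.0])
site_b = ([[3.0], [4.0]], [7.0, 9.0])

summaries = [site_summaries(*site_a), site_summaries(*site_b)]
xtx, xty, n = combine(summaries)
beta = solve(xtx, xty)  # [intercept, slope] estimated from summaries alone
```

Because X'X and X'y are sums over rows, adding the site-level matrices gives the same normal equations as pooling the row-level data, so the coefficients match a pooled fit. Logistic regression has no closed-form solution, so a comparable sketch would exchange summary statistics over several iterations; that case is not shown here.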
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Pesticide Residue Analysis and Safety · Advanced Statistical Methods and Models
