Modelling Complex Survey Data Using R, SAS, SPSS and Stata: A Comparison Using CLSA Datasets
Hon Yiu So, Urun Erbas Oz, Lauren Griffith, Susan Kirkland, Jinhua Ma, Parminder Raina, Nazmul Sohel, Mary E. Thompson, Christina Wolfson, and Changbao Wu

TL;DR
This paper compares the use of R, SAS, SPSS, and Stata for analyzing complex survey data from the CLSA, providing detailed procedures and code to assist researchers in health sciences.
Contribution
It offers a comprehensive comparison and detailed coding guide for analyzing survey data across multiple statistical software packages, promoting R usage in health research.
Findings
The R survey package can analyze complex survey data such as the CLSA datasets
Detailed R code supports analyses comparable to those run in commercial software (SAS, SPSS, Stata)
The paper encourages adoption of R for health survey analysis
Abstract
The R software has become popular among researchers due to its flexibility and open-source nature. However, researchers in public health and epidemiology are more accustomed to commercial statistical software such as SAS, SPSS and Stata. This paper provides a comprehensive comparison of the analysis of health survey data using the R survey package, SAS, SPSS and Stata. We describe detailed R code, and the corresponding procedures for the other software packages, for commonly encountered statistical analyses, such as estimation of population means and regression analysis, using datasets from the Canadian Longitudinal Study on Aging (CLSA). It is hoped that the paper stimulates interest among health science researchers in carrying out data analysis using R and also serves as a cookbook for statistical analysis using different software packages.
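To make "estimation of population means" concrete, the sketch below computes a design-weighted mean and a Taylor-linearization standard error for a stratified, single-stage sample. It is not the paper's code, which uses the R survey package and the commercial packages. It is written in plain Python only to show the arithmetic, and the function name `weighted_mean_se` and the toy values are hypothetical, not CLSA data. It assumes sampling with replacement within strata (no finite population correction) and at least two observations per stratum.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_se(y, w, strata):
    """Weighted (ratio-type) mean and its linearized standard error.

    Assumes a stratified single-stage design, with-replacement
    approximation, and no finite population correction.
    """
    total_w = sum(w)
    mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w

    # Linearized variable for the ratio estimator sum(w*y) / sum(w).
    z_by_stratum = defaultdict(list)
    for yi, wi, h in zip(y, w, strata):
        z_by_stratum[h].append(wi * (yi - mean) / total_w)

    # Sum the between-unit variance contributions stratum by stratum.
    variance = 0.0
    for z in z_by_stratum.values():
        n_h = len(z)
        if n_h < 2:
            raise ValueError("each stratum needs at least two observations")
        z_bar = sum(z) / n_h
        variance += n_h / (n_h - 1) * sum((zi - z_bar) ** 2 for zi in z)

    return mean, sqrt(variance)


# Toy illustration with made-up values (hypothetical, not CLSA data).
y = [62.0, 70.5, 58.0, 81.0, 77.5, 69.0]
w = [120.0, 120.0, 95.0, 95.0, 150.0, 150.0]
strata = ["A", "A", "B", "B", "C", "C"]
mean, se = weighted_mean_se(y, w, strata)
print(f"weighted mean = {mean:.2f}, SE = {se:.2f}")
```

With equal weights and a single stratum the result reduces to the ordinary sample mean and its usual standard error, which is a convenient sanity check before comparing against output from a survey-aware package.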
Taxonomy
Topics: Nutritional Studies and Diet
