Wrangling Real-World Data: Optimizing Clinical Research Through Factor Selection with LASSO Regression
Kerry A. Howard, Wes Anderson, Jagdeep T. Podichetty, Ruth Gould, Danielle Boyce, Pam Dasher, Laura Evans, Cindy Kao, Vishakha K. Kumar, Chase Hamilton, Ewy Mathé, Philippe J. Guerin, Kenneth Dodd, Aneesh K. Mehta, Chris Ortman, Namrata Patil, Jeselyn Rhodes, Matthew Robinson

TL;DR
This paper explores how real-world clinical data and LASSO regression can help identify key factors affecting mortality in hospitalized COVID-19 patients.
Contribution
The study demonstrates how collaborative platforms like CURE ID, combined with factor selection methods such as LASSO regression, can streamline research and drug-repurposing efforts for infectious diseases.
Findings
Age, lab measures, severity indicators, oxygen support, and comorbidities significantly influence 28-day mortality in hospitalized COVID-19 patients.
Collaborative repositories like CURE ID provide robust datasets for prognostic research.
Factor selection methods like LASSO regression help identify key variables for streamlined research.
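To make the factor-selection idea concrete, the following is a minimal sketch of L1-penalized (LASSO) logistic regression fitted by proximal gradient descent. It is not the authors' pipeline: the data are synthetic, the variable names (age, lab_measure, oxygen_support, noise_a, noise_b), the penalty strength lam, the step size, and the iteration count are all assumptions chosen for illustration, and the features are generated on a standard scale (real clinical variables would need standardizing first).

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.5, iters=200):
    """Minimize mean log-loss + lam * sum(|w_j|) by proximal gradient descent.

    The intercept b is left unpenalized.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(iters):
        grad_w = [0.0] * p
        grad_b = 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # predicted prob - outcome
            grad_b += err
            for j in range(p):
                grad_w[j] += err * row[j]
        b -= step * grad_b / n
        # Gradient step on the log-loss, then the L1 proximal (shrinkage) step.
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic cohort (hypothetical): three informative factors, two pure noise.
names = ["age", "lab_measure", "oxygen_support", "noise_a", "noise_b"]
true_w = [1.2, 0.8, 0.8, 0.0, 0.0]
rng = random.Random(0)
X, y = [], []
for _ in range(1500):
    row = [rng.gauss(0.0, 1.0) for _ in names]
    z = -1.0 + sum(t * x for t, x in zip(true_w, row))
    y.append(1 if rng.random() < 1.0 / (1.0 + math.exp(-z)) else 0)
    X.append(row)

weights, intercept = fit_lasso_logistic(X, y, lam=0.05)
# Factors whose coefficients survive the shrinkage are the "selected" ones.
selected = {name: w for name, w in zip(names, weights) if w != 0.0}
print(selected)
```

Raising lam shrinks more coefficients to exactly zero and yields a shorter factor list; lowering it keeps more variables. In practice the penalty is usually tuned by cross-validation, and an established library implementation would normally be preferred over a hand-written loop like this one.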
Abstract
Data-driven approaches to clinical research are necessary for understanding and effectively treating infectious diseases. However, challenges such as issues with data validity, lack of collaboration, and difficult-to-treat infectious diseases (e.g., those that are rare or newly emerging) hinder research. Prioritizing innovative methods to facilitate the continued use of data generated during routine clinical care for research, but in an organized, accelerated, and shared manner, is crucial. This study investigates the potential of CURE ID, an open-source platform to accelerate drug-repurposing research for difficult-to-treat diseases, with COVID-19 as a use case. Data from eight US health systems were analyzed using least absolute shrinkage and selection operator (LASSO) regression to identify key predictors of 28-day all-cause mortality in COVID-19 patients, including demographics,…
Figure 1
Figure 2
Figure 3
Taxonomy
Topics: Machine Learning in Healthcare · COVID-19 diagnosis using AI · Artificial Intelligence in Healthcare
