Machine Learning in Epidemiology
Marvin N. Wright, Lukas Burk, Pegah Golchian, Jan Kapar, Niklas Koenen, Sophie Hanna Langbein

TL;DR
This paper reviews how machine learning techniques can be effectively applied in epidemiology, emphasizing methods, evaluation strategies, and interpretability, supported by practical R code examples using heart disease data.
Contribution
It provides a comprehensive methodological foundation for applying machine learning in epidemiology, including principles, methods, evaluation, and interpretability, with practical R examples.
Findings
Introduces core machine learning principles for epidemiology
Details strategies for model evaluation and hyperparameter tuning
Provides practical R code examples with heart disease data
Abstract
In the age of digital epidemiology, epidemiologists are faced with an increasing amount of data of growing complexity and dimensionality. Machine learning is a set of powerful tools that can help to analyze such enormous amounts of data. This chapter lays the methodological foundations for successfully applying machine learning in epidemiology. It covers the principles of supervised and unsupervised learning and discusses the most important machine learning methods. Strategies for model evaluation and hyperparameter optimization are developed, and interpretable machine learning is introduced. All these theoretical parts are accompanied by code examples in R, with an example dataset on heart disease used throughout the chapter.
Taxonomy
Topics: Artificial Intelligence in Healthcare · Machine Learning in Healthcare · Statistical Methods in Epidemiology
