Interpretable Machine Learning for Discovery: Statistical Challenges & Opportunities
Genevera I. Allen, Luqin Gan, Lili Zheng

TL;DR
This paper reviews the role of interpretable machine learning in scientific discovery, emphasizing validation challenges and opportunities for trustworthy, reproducible data-driven insights across various domains.
Contribution
It provides a comprehensive overview of techniques, validation methods, and open challenges in using interpretable machine learning for discovery in large datasets.
Findings
Discusses types of discoveries in supervised and unsupervised settings
Reviews practical validation approaches like data-splitting and stability
Highlights theoretical results on model selection and uncertainty quantification
Abstract
New technologies have led to vast troves of large and complex datasets across many scientific domains and industries. People routinely use machine learning techniques to not only process, visualize, and make predictions from this big data, but also to make data-driven discoveries. These discoveries are often made using Interpretable Machine Learning, or machine learning models and techniques that yield human understandable insights. In this paper, we discuss and review the field of interpretable machine learning, focusing especially on the techniques as they are often employed to generate new knowledge or make discoveries from large data sets. We outline the types of discoveries that can be made using Interpretable Machine Learning in both supervised and unsupervised settings. Additionally, we focus on the grand challenge of how to validate these discoveries in a data-driven manner,…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
