Scientific intuition inspired by machine learning generated hypotheses
Pascal Friederich, Mario Krenn, Isaac Tamblyn, Alan Aspuru-Guzik

TL;DR
This paper demonstrates how machine learning models, specifically gradient-boosted decision trees, can be used to extract interpretable scientific insights and generate hypotheses in chemistry and quantum physics, with the aim of strengthening human understanding.
Contribution
The paper introduces a method for deriving scientific insights directly from machine learning models, moving beyond numerical prediction to hypothesis generation in scientific research.
Findings
Rediscovered known rules of thumb in chemistry
Identified new motifs controlling solubility and energy levels
Gained new understanding of quantum entanglement experiments
Abstract
Machine learning with application to questions in the physical sciences has become a widely used tool, successfully applied to classification, regression and optimization tasks in many areas. Research focus mostly lies in improving the accuracy of the machine learning models in numerical predictions, while scientific understanding is still almost exclusively generated by human researchers analysing numerical results and drawing conclusions. In this work, we shift the focus to the insights and the knowledge obtained by the machine learning models themselves. In particular, we study how this knowledge can be extracted and used to inspire human scientists to increase their intuitions and understanding of natural systems. We apply gradient boosting in decision trees to extract human interpretable insights from big data sets from chemistry and physics. In chemistry, we not only rediscover widely know…
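As a rough illustration of the idea named in the abstract (gradient boosting in decision trees, then reading insight out of the fitted model), here is a minimal sketch using only the Python standard library. It is not the authors' implementation. The synthetic data, the use of depth-1 trees (stumps), the squared-error loss, and the helper names `fit_stump` and `boost` are all assumptions made for this example. Each boosting round fits a stump to the current residuals, and the squared-error reduction credited to each feature is accumulated as a simple importance score.

```python
import random


def fit_stump(X, residuals):
    """Return the one-feature threshold split that most reduces squared error.

    Result is (gain, feature, threshold, left_value, right_value), or None
    if no valid split exists.
    """
    n = len(X)
    total = sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors achieved by this split.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                # Samples with value <= lo go left, the rest go right.
                best = (gain, j, lo, left_sum / k, right_sum / (n - k))
    return best


def boost(X, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting with stumps under squared-error loss.

    Returns normalized per-feature importances and the final predictions.
    """
    n, d = len(X), len(X[0])
    mean_y = sum(y) / n
    pred = [mean_y] * n
    importance = [0.0] * d
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, thr, left_val, right_val = stump
        importance[j] += gain
        for i in range(n):
            step = left_val if X[i][j] <= thr else right_val
            pred[i] += learning_rate * step
    total = sum(importance) or 1.0
    return [v / total for v in importance], pred


# Hypothetical toy data: feature 0 matters most, feature 1 a little,
# feature 2 not at all.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * x[0] + 0.5 * x[1] + rng.gauss(0, 0.1) for x in X]

importance, pred = boost(X, y)
for j, score in enumerate(importance):
    print(f"feature {j}: importance {score:.3f}")
```

In this toy setting the importance ranking plays the role of a candidate hypothesis ("the target depends mainly on feature 0") that a human would then examine. The paper's chemistry and physics features, and the way it turns model output into insight, are not reproduced here.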
