Scientific intuition inspired by machine learning generated hypotheses

Pascal Friederich; Mario Krenn; Isaac Tamblyn; Alan Aspuru-Guzik

arXiv:2010.14236·cs.LG·July 20, 2021

TL;DR

This paper demonstrates how machine learning models, specifically gradient-boosted decision trees, can be used to extract human-interpretable insights and generate hypotheses in chemistry and quantum physics, with the aim of strengthening researchers' intuition about these systems.

Contribution

It introduces a method to derive scientific insights directly from machine learning models, moving beyond numerical predictions to hypothesis generation in scientific research.

Findings

01. Rediscovered known rules of thumb in chemistry

02. Identified new motifs controlling solubility and energy levels

03. Gained new understanding of quantum entanglement experiments

Abstract

Machine learning applied to questions in the physical sciences has become a widely used tool, successfully applied to classification, regression and optimization tasks in many areas. Research mostly focuses on improving the accuracy of machine learning models in numerical predictions, while scientific understanding is still almost exclusively generated by human researchers analysing numerical results and drawing conclusions. In this work, we shift the focus to the insights and the knowledge obtained by the machine learning models themselves. In particular, we study how this knowledge can be extracted and used to inspire human scientists and to increase their intuition and understanding of natural systems. We apply gradient boosting in decision trees to extract human-interpretable insights from big data sets from chemistry and physics. In chemistry, we not only rediscover widely know…
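
To make the idea of reading insights out of a boosted tree model more concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline. It assumes squared-error loss, depth-1 trees (stumps), and a gain-based importance score, and it runs on synthetic data. The names `fit_stump`, `gradient_boost`, and the `descriptor_*` feature labels are hypothetical and chosen only for illustration.

```python
import random

def fit_stump(X, residuals):
    """Return the single-feature threshold split with the largest squared-error reduction."""
    n, d = len(X), len(X[0])
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot place a threshold between equal values
            right_sum = total - left_sum
            # Reduction in sum of squared errors when each side predicts its own mean
            gain = left_sum**2 / k + right_sum**2 / (n - k) - total**2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def gradient_boost(X, y, n_rounds=50, learning_rate=0.1):
    """Fit stumps to residuals round by round and accumulate split gain per feature."""
    n, d = len(X), len(X[0])
    base = sum(y) / n
    pred = [base] * n
    stumps, importance = [], [0.0] * d
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, thr, left, right = stump
        importance[j] += gain
        stumps.append((j, thr, left, right))
        for i in range(n):
            pred[i] += learning_rate * (left if X[i][j] <= thr else right)
    total = sum(importance) or 1.0
    return base, stumps, [g / total for g in importance]

# Synthetic data (an assumption, not from the paper): the target depends strongly
# on feature 0, weakly on feature 1, and not at all on feature 2.
random.seed(0)
feature_names = ["descriptor_a", "descriptor_b", "descriptor_c"]  # hypothetical labels
X = [[random.random() for _ in range(3)] for _ in range(200)]
y = [3.0 * (x[0] > 0.5) + 0.5 * x[1] + random.gauss(0, 0.1) for x in X]

base, stumps, importance = gradient_boost(X, y)
ranking = sorted(range(3), key=lambda j: -importance[j])

for j in ranking:
    print(f"{feature_names[j]}: relative importance {importance[j]:.3f}")
j, thr, left, right = stumps[0]
print(f"first rule: if {feature_names[j]} <= {thr:.2f} add {left:.2f}, else add {right:.2f}")
```

In this sketch the ranked importances and the individual split rules are the human-readable output: a researcher could treat a highly ranked descriptor, or a recurring threshold, as a candidate hypothesis to check against domain knowledge. Gain-based importance is only one of several ways to interrogate such a model, and the source excerpt does not say which one the authors use.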
