Breiman's two cultures: You don't have to choose sides
Andrew C. Miller, Nicholas J. Foti, Emily B. Fox

TL;DR
This paper expands Breiman's dichotomy of data analysis cultures by introducing a third, scientific-mechanistic approach and discusses how modern tools enable hybrid models that balance interpretability, scientific knowledge, and predictive accuracy.
Contribution
It proposes a third data analysis culture based on scientific models and explores how hybrid models can overcome traditional limitations of the two original cultures.
Findings
Hybrid models can achieve accurate and robust predictions.
Modern computational tools enable interpolation between model cultures.
Hybrid approaches can address issues like the Rashomon effect and Occam's dilemma.
Abstract
Breiman's classic paper casts data analysis as a choice between two cultures: data modelers and algorithmic modelers. Stated broadly, data modelers use simple, interpretable models with well-understood theoretical properties to analyze data. Algorithmic modelers prioritize predictive accuracy and use more flexible function approximations to analyze data. This dichotomy overlooks a third set of models: mechanistic models derived from scientific theories (e.g., ODE/SDE simulators). Mechanistic models encode application-specific scientific knowledge about the data. While these categories represent extreme points in model space, modern computational and algorithmic tools enable us to interpolate between these points, producing flexible, interpretable, and scientifically informed hybrids that can enjoy accurate and robust predictions, and resolve issues with data analysis that Breiman…
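As a rough illustration of what "interpolating" between a mechanistic model and a flexible function approximator might look like, the sketch below fits a one-parameter ODE solution and then lets a small Fourier-basis ridge regression absorb whatever the mechanism misses. Everything in it is an assumption made for illustration and is not taken from the paper: the exponential-decay ODE, the synthetic data, the grid search, and the helper names `mechanistic`, `fit_decay_rate`, `features`, and `fit_residual` are all hypothetical.

```python
import numpy as np

def mechanistic(t, k, x0=1.0):
    """Closed-form solution of the ODE dx/dt = -k * x with x(0) = x0."""
    return x0 * np.exp(-k * t)

def fit_decay_rate(t, y, grid=None):
    """Choose the decay rate k by a simple grid search on squared error."""
    if grid is None:
        grid = np.linspace(0.1, 3.0, 291)
    errors = [np.mean((y - mechanistic(t, k)) ** 2) for k in grid]
    return float(grid[int(np.argmin(errors))])

def features(t, n_freq=5):
    """Flexible basis for the residual: a constant plus a few Fourier terms."""
    cols = [np.ones_like(t)]
    for j in range(1, n_freq + 1):
        cols += [np.sin(j * t), np.cos(j * t)]
    return np.column_stack(cols)

def fit_residual(t, resid, ridge=1e-3):
    """Ridge regression of the mechanistic model's residual on the basis."""
    Phi = features(t)
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ resid)

# Synthetic data (assumed): decay dynamics plus an oscillation that the
# mechanistic model does not describe, plus a little observation noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 200)
y = mechanistic(t, k=0.8) + 0.1 * np.sin(3.0 * t) + 0.01 * rng.standard_normal(t.size)

# Step 1: purely mechanistic fit (one interpretable parameter).
k_hat = fit_decay_rate(t, y)
mech_pred = mechanistic(t, k_hat)

# Step 2: hybrid fit = mechanistic prediction + learned residual correction.
w = fit_residual(t, y - mech_pred)
hybrid_pred = mech_pred + features(t) @ w

mse_mech = float(np.mean((y - mech_pred) ** 2))
mse_hybrid = float(np.mean((y - hybrid_pred) ** 2))
print(f"k_hat={k_hat:.2f}  mechanistic MSE={mse_mech:.5f}  hybrid MSE={mse_hybrid:.5f}")
```

The comparison here is in-sample only, so it says nothing about robustness or generalization; it only shows the structure of such a hybrid, in which the interpretable parameter `k_hat` is kept while the flexible component accounts for unmodeled structure.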
Taxonomy
Topics: Scientific Computing and Data Management · Simulation Techniques and Applications · Data Visualization and Analytics
