Data-Centric Engineering: integrating simulation, machine learning and statistics. Challenges and Opportunities
Indranil Pan, Lachlan Mason, Omar Matar

TL;DR
This paper reviews the emerging field of data-centric engineering that combines simulation, machine learning, and statistics to enhance physical modeling, discussing current trends, opportunities, challenges, and future workforce needs.
Contribution
It provides a comprehensive overview of the integration of simulations, machine learning, and statistics in engineering, highlighting key research directions and challenges.
Findings
Hybrid approaches leverage physical models and data-driven methods.
Integration unlocks new opportunities in physical sciences and engineering.
The review identifies key bottlenecks and future workforce upskilling needs.
Abstract
Recent advances in machine learning, coupled with low-cost computation, availability of cheap streaming sensors, data storage and cloud technologies, have led to widespread multi-disciplinary research activity with significant interest and investment from commercial stakeholders. Mechanistic models, based on physical equations, and purely data-driven statistical approaches represent two ends of the modelling spectrum. New hybrid, data-centric engineering approaches, leveraging the best of both worlds and integrating both simulations and data, are emerging as a powerful tool with a transformative impact on the physical disciplines. We review the key research trends and application scenarios in the emerging field of integrating simulations, machine learning, and statistics. We highlight the opportunities that such an integrated vision can unlock and outline the key challenges holding back…
Taxonomy
Topics: Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification
