Multifidelity linear regression for scientific machine learning from scarce data
Elizabeth Qian, Dayoung Kang, Vignesh Sella, Anirban Chaudhuri

TL;DR
This paper introduces a multifidelity linear regression method for scientific machine learning that combines scarce, expensive high-fidelity data with cheaper lower-fidelity data, reducing the amount of high-fidelity data needed while improving robustness and accuracy.
Contribution
The paper proposes a multifidelity training approach that combines linear regression with control variates, and provides a theoretical bias and variance analysis of the resulting estimators, targeting robustness when high-fidelity data are scarce.
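To make the idea of combining linear regression with control variates concrete, here is a minimal sketch on a toy 1D problem. It is an illustration under stated assumptions, not the authors' exact algorithm: the function names (`fit_ls`, `multifidelity_fit`, `features`), the scalar weight `alpha`, the toy high- and low-fidelity models, and the noise structure are all hypothetical. The sketch corrects the least-squares coefficients fitted to a few high-fidelity samples using the difference between two low-fidelity fits, one on the full low-fidelity data set and one on the shared inputs only.

```python
import numpy as np


def fit_ls(Phi, y):
    """Ordinary least-squares coefficients for the design matrix Phi."""
    beta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return beta


def multifidelity_fit(Phi_n, y_hi, y_lo_n, Phi_extra, y_lo_extra, alpha=1.0):
    """Control-variate-style correction of high-fidelity coefficients.

    Phi_n      : features at the n inputs where both fidelities are available
    y_hi       : high-fidelity outputs at those n inputs
    y_lo_n     : low-fidelity outputs at the same n inputs
    Phi_extra  : features at additional inputs with low-fidelity data only
    y_lo_extra : low-fidelity outputs at the additional inputs
    alpha      : control variate weight (hypothetical scalar choice)
    """
    beta_hi = fit_ls(Phi_n, y_hi)
    beta_lo_small = fit_ls(Phi_n, y_lo_n)
    Phi_all = np.vstack([Phi_n, Phi_extra])
    y_lo_all = np.concatenate([y_lo_n, y_lo_extra])
    beta_lo_large = fit_ls(Phi_all, y_lo_all)
    # The two low-fidelity fits estimate the same coefficients, so their
    # difference acts as a correction correlated with the error in beta_hi.
    return beta_hi + alpha * (beta_lo_large - beta_lo_small)


def features(x):
    """Affine features [1, x] (an assumed model class for this toy problem)."""
    return np.column_stack([np.ones_like(x), x])


rng = np.random.default_rng(0)
n, m_extra = 8, 200  # scarce high-fidelity data, plentiful low-fidelity data
x_n = rng.uniform(-1.0, 1.0, n)
x_extra = rng.uniform(-1.0, 1.0, m_extra)
Phi_n, Phi_extra = features(x_n), features(x_extra)

# Toy data: the two fidelities share a disturbance at common inputs, which is
# what makes the low-fidelity fit a useful control variate in this sketch.
eps = rng.normal(0.0, 0.3, n)
y_hi = 1.0 + 2.0 * x_n + eps
y_lo_n = 0.8 + 1.9 * x_n + 0.9 * eps
y_lo_extra = 0.8 + 1.9 * x_extra + 0.9 * rng.normal(0.0, 0.3, m_extra)

beta_hf = fit_ls(Phi_n, y_hi)
beta_mf = multifidelity_fit(Phi_n, y_hi, y_lo_n, Phi_extra, y_lo_extra)
print("high-fidelity only:", beta_hf)
print("multifidelity:     ", beta_mf)
```

Setting `alpha=0.0` recovers the high-fidelity-only fit. In practice the weight would be chosen from the correlation between fidelities; the fixed value used here is only a placeholder.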
Findings
Achieves accuracy similar to high-fidelity-only methods with much less high-fidelity data
Provides bias and variance guarantees for the estimators
Demonstrates effectiveness through numerical experiments
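For context on what bias and variance statements for control variate estimators typically look like, the following is the textbook scalar identity, not the paper's specific result. The symbols $\hat\theta$ (a base estimator), $\hat\eta$ (a correlated auxiliary estimator with known mean $\eta$), $\alpha$ (the weight), and $\rho$ (their correlation) are introduced here for illustration only.

```latex
\begin{align*}
\hat\theta_{\mathrm{cv}} &= \hat\theta + \alpha\,(\eta - \hat\eta),
\qquad
\mathbb{E}[\hat\theta_{\mathrm{cv}}] = \mathbb{E}[\hat\theta], \\
\operatorname{Var}[\hat\theta_{\mathrm{cv}}]
 &= \operatorname{Var}[\hat\theta]
  + \alpha^{2}\operatorname{Var}[\hat\eta]
  - 2\alpha\operatorname{Cov}[\hat\theta,\hat\eta], \\
\alpha^{*} &= \frac{\operatorname{Cov}[\hat\theta,\hat\eta]}{\operatorname{Var}[\hat\eta]}
\quad\Longrightarrow\quad
\operatorname{Var}[\hat\theta_{\mathrm{cv}}]
 = \bigl(1-\rho^{2}\bigr)\operatorname{Var}[\hat\theta].
\end{align*}
```

The correction leaves the expectation unchanged, and the variance reduction grows as the two estimators become more strongly correlated.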
Abstract
Machine learning (ML) methods, which fit to data the parameters of a given parameterized model class, have garnered significant interest as potential methods for learning surrogate models for complex engineering systems for which traditional simulation is expensive. However, in many scientific and engineering settings, generating high-fidelity data on which to train ML models is expensive, and the available budget for generating training data is limited, so that high-fidelity training data are scarce. ML models trained on scarce data have high variance, resulting in poor expected generalization performance. We propose a new multifidelity training approach for scientific machine learning via linear regression that exploits the scientific context where data of varying fidelities and costs are available: for example, high-fidelity data may be generated by an expensive fully resolved…
Taxonomy
Topics: Machine Learning and Data Classification
Methods: Linear Regression
