Multifidelity linear regression for scientific machine learning from scarce data

Elizabeth Qian; Dayoung Kang; Vignesh Sella; Anirban Chaudhuri

arXiv:2403.08627·stat.ML·July 3, 2024·1 cites

Open Access

TL;DR

This paper introduces a multifidelity linear regression method for scientific machine learning that effectively leverages data of varying fidelities to reduce the need for expensive high-fidelity data, improving robustness and accuracy.

Contribution

The paper proposes a multifidelity training approach that combines linear regression with control variates, and provides a theoretical bias and variance analysis of the resulting estimators, aimed at improving robustness when high-fidelity data are scarce.
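
To make the control variate idea concrete, here is a minimal sketch of one generic way to correct a high-fidelity least-squares fit with low-fidelity data. It is not taken from the paper and may differ from the authors' estimator: the function names `ols` and `multifidelity_ols`, the scalar weight `alpha`, and the synthetic models `f_hi` and `f_lo` are all assumptions made for illustration. The sketch fits the low-fidelity outputs twice, once on the few inputs shared with the high-fidelity data and once on a much larger set, and adds the weighted difference of the two fits to the high-fidelity coefficients.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X @ beta."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def multifidelity_ols(X_hi, y_hi, y_lo_paired, X_lo, y_lo, alpha=1.0):
    """Control-variate-style correction of high-fidelity OLS coefficients.

    X_hi, y_hi  : the few inputs that have expensive high-fidelity outputs
    y_lo_paired : cheap low-fidelity outputs at those same inputs
    X_lo, y_lo  : a much larger, cheap low-fidelity data set
    alpha       : control variate weight (hypothetical scalar; alpha=0
                  recovers the high-fidelity-only fit)
    """
    beta_hi = ols(X_hi, y_hi)
    beta_lo_paired = ols(X_hi, y_lo_paired)
    beta_lo_full = ols(X_lo, y_lo)
    # The two low-fidelity fits estimate the same quantity, so their
    # difference only shifts the estimate by sampling error that is
    # correlated with the sampling error of beta_hi.
    return beta_hi + alpha * (beta_lo_full - beta_lo_paired)


def features(x):
    """Quadratic polynomial features (an arbitrary illustrative choice)."""
    return np.column_stack([np.ones_like(x), x, x**2])


def f_hi(x):
    """Stand-in for an expensive high-fidelity model (made up)."""
    return np.sin(2.0 * x) + 0.5 * x


def f_lo(x):
    """Stand-in for a cheap, correlated low-fidelity model (made up)."""
    return 0.9 * np.sin(2.0 * x) + 0.4 * x + 0.1


rng = np.random.default_rng(0)
x_hi = rng.uniform(-1.0, 1.0, size=8)                  # scarce HF inputs
x_lo = np.concatenate([x_hi, rng.uniform(-1.0, 1.0, size=500)])

beta_hf_only = ols(features(x_hi), f_hi(x_hi))
beta_mf = multifidelity_ols(
    features(x_hi), f_hi(x_hi), f_lo(x_hi),
    features(x_lo), f_lo(x_lo),
)
print("high-fidelity only:", beta_hf_only)
print("multifidelity     :", beta_mf)
```

In this sketch `alpha` is fixed by hand. In control variate methods generally, the weight is chosen from the correlation between the high- and low-fidelity quantities, which would have to be estimated in practice.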

Findings

01 Achieves similar accuracy to high-fidelity-only methods with much less high-fidelity data
02 Provides bias and variance guarantees for the estimators
03 Demonstrates effectiveness through numerical experiments

Abstract

Machine learning (ML) methods, which fit to data the parameters of a given parameterized model class, have garnered significant interest as potential methods for learning surrogate models for complex engineering systems for which traditional simulation is expensive. However, in many scientific and engineering settings, generating high-fidelity data on which to train ML models is expensive, and the available budget for generating training data is limited, so that high-fidelity training data are scarce. ML models trained on scarce data have high variance, resulting in poor expected generalization performance. We propose a new multifidelity training approach for scientific machine learning via linear regression that exploits the scientific context where data of varying fidelities and costs are available: for example, high-fidelity data may be generated by an expensive fully resolved…

Taxonomy

Topics: Machine Learning and Data Classification

Methods: Linear Regression