Neural Network Training Using $\ell_1$-Regularization and Bi-fidelity Data

Subhayan De; Alireza Doostan

arXiv:2105.13011 · stat.ML · April 6, 2022

TL;DR

This paper introduces bi-fidelity $\ell_1$-regularization strategies for neural network training that use low-fidelity data to improve accuracy when high-fidelity data are scarce, outperforming standard $\ell_1$-regularization.

Contribution

The paper proposes novel bi-fidelity $\ell_1$-regularization methods that incorporate low-fidelity model information into neural network training for small high-fidelity datasets.

Findings

01

Bi-fidelity strategies reduce errors by an order of magnitude compared to high-fidelity-only training.

02

Bi-fidelity methods outperform standard $\ell_1$-regularization in uncertainty propagation tasks.

03

The approach effectively leverages low-fidelity data to enhance neural network surrogate models.

Abstract

With the capability of accurately representing a functional relationship between the inputs of a physical system's model and output quantities of interest, neural networks have become popular for surrogate modeling in scientific applications. However, as these networks are over-parameterized, their training often requires a large amount of data. To prevent overfitting and improve generalization error, regularization based on, e.g., $\ell_1$- and $\ell_2$-norms of the parameters is applied. Similarly, multiple connections of the network may be pruned to increase sparsity in the network parameters. In this paper, we explore the effects of sparsity promoting $\ell_1$-regularization on training neural networks when only a small training dataset from a high-fidelity model is available. As opposed to standard $\ell_1$-regularization that is known to be inadequate, we consider two variants of…
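
To make the sparsity-promoting penalty mentioned in the abstract concrete, here is a minimal NumPy sketch of an $\ell_1$ penalty on a parameter vector and one proximal-gradient (soft-thresholding) update. It is not the paper's algorithm: the abstract above is cut off before the two variants are described. The optional `theta_ref` argument is a hypothetical illustration of how parameters obtained from low-fidelity data might anchor the penalty, and the function names, learning rate `lr`, and weight `lam` are all assumptions.

```python
import numpy as np

def l1_penalty(theta, lam, theta_ref=None):
    """Return lam * ||theta - theta_ref||_1.

    With theta_ref=None this is the standard l1 penalty lam * ||theta||_1.
    A non-zero theta_ref is a hypothetical anchor (e.g. parameters fitted to
    low-fidelity data); it is an assumption, not taken from the source.
    """
    ref = np.zeros_like(theta) if theta_ref is None else theta_ref
    return lam * np.sum(np.abs(theta - ref))

def soft_threshold(z, t):
    """Shrink each entry of z toward zero by t; entries within t of zero become 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_step(theta, grad, lr, lam, theta_ref=None):
    """One proximal-gradient step for data_loss(theta) + lam * ||theta - theta_ref||_1.

    grad is the gradient of the data-misfit term only; the l1 term is handled
    exactly by soft-thresholding the offset from theta_ref.
    """
    ref = np.zeros_like(theta) if theta_ref is None else theta_ref
    z = theta - lr * grad                 # plain gradient step on the data loss
    return ref + soft_threshold(z - ref, lr * lam)
```

With `theta_ref=None`, small parameters are driven exactly to zero, which is the sparsity effect the abstract refers to. With a non-zero `theta_ref`, the same update instead drives parameters exactly onto the reference values unless the data gradient pulls them away.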
