Active learning for structural reliability analysis with multiple limit state functions through variance-enhanced PC-Kriging surrogate models
J. Morán A., P.G. Morato, P. Rigo

J. Morán A
P.G. Morato
P. Rigo
ANAST, Department of ArGEnCo, University of Liege, 4000, Liege, Belgium
Department of Wind and Energy Systems, Technical University of Denmark, 4000 Roskilde, Denmark
Abstract
Existing active strategies for training surrogate models yield accurate structural reliability estimates by aiming at design space regions in the vicinity of a specified limit state function. In many practical engineering applications, various damage conditions, e.g. repair, failure, should be probabilistically characterized, thus demanding the estimation of multiple performance functions. In this work, we investigate the capability of active learning approaches for efficiently selecting training samples under a limited computational budget while still preserving the accuracy associated with multiple surrogated limit states. Specifically, PC-Kriging-based surrogate models are actively trained considering a variance correction derived from leave-one-out cross-validation error information, whereas the sequential learning scheme relies on U-function-derived metrics. The proposed active learning approaches are tested in a highly nonlinear structural reliability setting, whereas in a more practical application, failure and repair events are stochastically predicted in the aftermath of a ship collision against an offshore wind substructure. The results show that a balanced computational budget administration can be effectively achieved by successively targeting the specified multiple limit state functions within a unified active learning scheme.
Keywords: Active learning; Gaussian Processes; Structural reliability; Surrogate modeling; Offshore wind turbines; Polynomial chaos expansion.
Journal: ICASP14
1 INTRODUCTION
An optimized design and management of engineering systems from a life-cycle perspective aims at jointly minimizing maintenance costs and structural failure risk, quantified via economic and structural reliability metrics [1]. To probabilistically characterize a failure event and/or a specific damage condition, one or multiple performance functions can be accordingly formulated, thereby accounting for the uncertainty associated with system response predictions and informing asset management decisions. The quantification of uncertainties associated with specifically defined failure and/or damage events normally demands the computation of several high-fidelity engineering simulations, e.g., finite element analysis, computational fluid dynamics.
Considering that high-fidelity simulations can be time-consuming and computationally expensive, surrogate models offer an attractive solution by providing a light-running approximation of the model response based on a reduced number of data points. In general, surrogate (meta)models are able to efficiently learn the mathematical relationship between uncertain input design variables and a relevant output quantity of interest (QoI). Among various proposed surrogate models, Gaussian process (GP)-based models, e.g., Kriging and PC-Kriging, have demonstrated their effectiveness in supporting a wide range of engineering applications [2]. Leveraging their probabilistic formulation, GP-based surrogate models yield not only a point estimate of the QoI but also an uncertainty measure associated with the generated prediction. In many applications, surrogate models seek to minimize the generalization error of the system response with the least number of high-fidelity model evaluations, often inducing a global exploration of the design space. When dealing with structural reliability applications, the relevant limit state(s) can be directly surrogated, and the experimental design logically focuses on regions near the limit state boundary in order to accurately estimate the probability associated with a specific event, e.g., failure, damage condition.
With the goal of efficiently improving the accuracy of a surrogated limit state function, active learning is a machine learning technique that sequentially selects the training samples based on a specified learning metric. This approach is particularly useful when the computational cost of retrieving new high-fidelity model evaluations is high and/or in settings under a constrained computational/economic budget. From the original development of the Expected Feasibility Function (EFF) [3] and the U-function [4], sophisticated active learning schemes have been widely proposed in the literature, in which the training samples are collected based on an exploitation-exploration trade-off, i.e., exploiting observations near the limit state or exploring yet uncertain ones. As suggested by [5], active learning approaches can be generally categorized according to four distinctive features: (i) surrogate model choice, (ii) failure probability estimation technique, (iii) learning enhancement metric, and (iv) stopping criteria.
Active learning methods have proven effective for a myriad of structural reliability problems [5]. In those settings, however, the computational budget is mainly dedicated to efficiently capturing a single limit state and its associated failure event. In many engineering applications (e.g., inspection and maintenance planning), events other than structural failure may be relevant, hence potentially demanding the evaluation of multiple limit states that inform, for instance, operational decisions. In this work, we investigate the capability of active learning approaches to identify training samples when dealing with multiple interrelated limit states under a common computational budget, in which an observation can be informative for estimating either a single event or several events at once. Specifically, PC-Kriging-based surrogate models are actively trained while additionally accounting for a variance correction derived from leave-one-out cross-validation error information, whereas the learning scheme sequentially selects Monte Carlo samples scored according to a U-function.
In order to effectively balance the generation of training points around multiple limit states within a common budget, we additionally propose here active learning strategies that sequentially select observations with the objective of jointly improving the accuracy of all treated limit states. Besides the traditional exploitation-exploration trade-off, training samples are sequentially picked by balancing the predicted accuracy among the surrogated limit states. The proposed active training approaches are then tested in a characteristic multi-modal structural reliability setting, examining their accuracy and training efficiency; and in a more practical application, we efficiently surrogate limit states associated with both failure and repair events corresponding to the aftermath of a ship collision against an offshore wind substructure.
2 LEARNING LIMIT STATE FUNCTIONS THROUGH SURROGATE MODELS
2.1 PC-Kriging surrogate models
Consider the response of a system represented by $Y$, a one-dimensional output retrieved from the deterministic mapping of the $M$-dimensional stochastic input parameter space, whose realizations are denoted as $\boldsymbol{x}$. Within this frame of reference, PC-Kriging (PCK) is a non-intrusive surrogate modeling method that combines the probabilistic features of Gaussian processes, also known as Kriging, with a prior trend specifically tailored to the underlying input random vector, $\boldsymbol{X}$, via Polynomial Chaos Expansion (PCE). While Kriging estimates the analyzed quantity of interest (QoI), $Y$, through a stationary spatial random process whose covariance function (kernel) depends on the relative distance between data points, PCE further incorporates prior knowledge into the Gaussian process mean as a sum of polynomials that are orthogonal with respect to the joint probability density function $f_{\boldsymbol{X}}$. Formally, a PCK metamodel, $\mathcal{M}^{(\mathrm{PCK})}(\boldsymbol{x})$, is defined as [2]:
$$\mathcal{M}^{(\mathrm{PCK})}(\boldsymbol{x}) = \sum_{\boldsymbol{\alpha} \in \mathcal{A}} a_{\boldsymbol{\alpha}} \, \psi_{\boldsymbol{\alpha}}(\boldsymbol{X}) + \sigma^2 Z(\boldsymbol{x}) \qquad (1)$$
where the first term corresponds to the mean of a Gaussian process, defined as the linear combination of coefficients, $a_{\boldsymbol{\alpha}}$, and multivariate orthogonal polynomials, $\psi_{\boldsymbol{\alpha}}(\boldsymbol{X})$, specified according to the input random vector, $\boldsymbol{X}$. The terms of the polynomial sum are multi-indexed through $\boldsymbol{\alpha} \in \mathcal{A}$. The constant variance, $\sigma^2$, appears in the second term of the equation, along with the zero-mean, unit-variance, stationary Gaussian process, $Z(\boldsymbol{x})$. The latter is described by an autocorrelation function $R(|\boldsymbol{x} - \boldsymbol{x}'|; \boldsymbol{\theta})$, where the covariance is usually defined based on the relative distance of its inputs, $|\boldsymbol{x} - \boldsymbol{x}'|$, and is parameterized by the hyperparameter $\boldsymbol{\theta}$, commonly referred to as the scale parameter or characteristic length.
To calibrate a PCK model built from a set of polynomials truncated by the size of $\mathcal{A}$, one can rely on non-intrusive methods that are based on discrete system responses, gathered from a sampling plan of $N$ input realizations, also known as experimental design. The parameters $\boldsymbol{a}$ and $\sigma^2$ can be calculated through maximum likelihood, i.e., by maximizing the likelihood associated with the model predictions. Since $\boldsymbol{a}$ and $\sigma^2$ are defined as a function of $\boldsymbol{\theta}$, an optimization process for determining $\boldsymbol{\theta}$ can be formulated as:
[Equation 2]
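As a rough illustration of how a polynomial trend and a stationary Gaussian process can be combined, the sketch below fits a one-dimensional PCK-like model by generalized least squares. It rests on several assumptions that are not taken from the paper: the input is a single standard normal variable (so probabilists' Hermite polynomials are the matching orthogonal family), the correlation length is fixed rather than optimized by maximum likelihood, and the function names `fit_pck` and `predict_pck` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander


def matern52(a, b, length):
    """Matern 5/2 correlation between two 1-D arrays of points."""
    d = np.abs(a[:, None] - b[None, :]) / length
    return (1.0 + np.sqrt(5.0) * d + 5.0 * d**2 / 3.0) * np.exp(-np.sqrt(5.0) * d)


def fit_pck(x, y, degree=2, length=1.0, nugget=1e-10):
    """Fit a Hermite trend plus a stationary GP (fixed length scale, assumption)."""
    F = hermevander(x, degree)                       # polynomial basis, shape (N, P)
    R = matern52(x, x, length) + nugget * np.eye(len(x))
    R_inv = np.linalg.inv(R)
    A = F.T @ R_inv @ F
    coeffs = np.linalg.solve(A, F.T @ R_inv @ y)     # trend coefficients a
    resid = y - F @ coeffs
    sigma2 = resid @ R_inv @ resid / len(x)          # process variance estimate
    return {"x": x, "F": F, "degree": degree, "length": length,
            "R_inv": R_inv, "A": A, "coeffs": coeffs,
            "resid": resid, "sigma2": sigma2}


def predict_pck(model, x_new):
    """Return the predictive mean and variance at the points x_new."""
    f = hermevander(x_new, model["degree"])
    r = matern52(x_new, model["x"], model["length"])
    r_rinv = r @ model["R_inv"]
    mean = f @ model["coeffs"] + r_rinv @ model["resid"]
    u = r_rinv @ model["F"] - f                      # trend-uncertainty term
    var = model["sigma2"] * (
        1.0
        - np.sum(r_rinv * r, axis=1)
        + np.sum(u * np.linalg.solve(model["A"], u.T).T, axis=1)
    )
    return mean, np.maximum(var, 0.0)


# Illustrative data only: a cubic response observed at seven points.
x_train = np.linspace(-2.0, 2.0, 7)
y_train = x_train**3 - x_train
pck = fit_pck(x_train, y_train)
mean_new, var_new = predict_pck(pck, np.array([0.3]))
```

In this sketch the model interpolates the training responses, and the predictive variance vanishes at the training points while growing between them, which is the property the active learning metrics exploit.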
In an iterative process, PCK models with multivariate orthogonal polynomials of increasing truncation order can be trained following the optimization process described before and then ranked according to a specified error metric. For example, the optimal PCK model can be defined as the one that yields the minimum leave-one-out (LOO) cross-validated error, $\epsilon_{\mathrm{LOO}}$, formulated as:
$$\epsilon_{\mathrm{LOO}} = \frac{1}{N} \sum_{i=1}^{N} \left( y^{(i)} - \mathcal{M}^{(\mathrm{PCK})}_{(-i)}\left(\boldsymbol{x}^{(i)}\right) \right)^2 \qquad (3)$$
This error metric is an estimate of the generalization error based on the current experimental design of $N$ points: it is the mean squared error between the responses $y^{(i)}$ and the predictions at $\boldsymbol{x}^{(i)}$ of a PCK model, $\mathcal{M}^{(\mathrm{PCK})}_{(-i)}$, calibrated with the remaining $N-1$ sample points.
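The LOO error does not have to be computed by refitting the model $N$ times. The sketch below illustrates this for a simplified case, assumed here for brevity: a zero-mean Gaussian process with fixed hyperparameters, for which the LOO residuals and LOO variances follow from a single matrix inverse. A PCK model would additionally carry the polynomial trend, so this is an illustration of the idea rather than the paper's implementation; all function names are hypothetical.

```python
import numpy as np


def matern52(x1, x2, length=1.0):
    """Matern 5/2 correlation between two sets of points (rows are points)."""
    d = np.linalg.norm(x1[:, None, :] - x2[None, :, :], axis=-1) / length
    return (1.0 + np.sqrt(5.0) * d + 5.0 * d**2 / 3.0) * np.exp(-np.sqrt(5.0) * d)


def loo_closed_form(K, y):
    """LOO residuals and variances of a zero-mean GP from one matrix inverse."""
    K_inv = np.linalg.inv(K)
    diag = np.diag(K_inv)
    residuals = (K_inv @ y) / diag   # y_i minus the prediction made without point i
    variances = 1.0 / diag           # predictive variance at x_i without point i
    return residuals, variances


def loo_brute_force(K, y):
    """Reference LOO residuals obtained by removing one point at a time."""
    n = len(y)
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        weights = np.linalg.solve(K[np.ix_(keep, keep)], K[keep, i])
        residuals[i] = y[i] - weights @ y[keep]
    return residuals


def loo_error(K, y):
    """Mean squared LOO residual, i.e. the LOO cross-validated error."""
    residuals, _ = loo_closed_form(K, y)
    return float(np.mean(residuals**2))


# Illustrative data only.
rng = np.random.default_rng(0)
X = rng.random((8, 2))
y = np.sin(3.0 * X[:, 0]) + X[:, 1]
K = matern52(X, X, length=0.5) + 1e-10 * np.eye(len(X))
eps_loo = loo_error(K, y)
```

The same residuals and variances are the ingredients needed later when a LOO-based correction is applied to the predicted variance.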
2.2 Active learning for structural reliability analysis
Compared with conventional design of experiments, e.g., Latin hypercube sampling (LHS), more accurate predictions can be achieved with fewer high-fidelity model evaluations if an active training approach is followed, in which a surrogate model is sequentially retrained based on the observations available up to that point. Each subsequent experimental design point is selected as the one that results in the maximum expected improvement in model accuracy according to a metric computed from a learning scoring function. Very often, this enhancement metric is mainly driven by the surrogate model's built-in probabilistic features, e.g., variance prediction.
A well-known scoring function is the one proposed by [4], in which a metric denoted as 'U' is evaluated for a set of randomly generated samples. A tailored and efficient exploration of the design space can be additionally accomplished if the samples are generated via simulation-based methods, e.g., Monte Carlo, where the evaluated points are directly sampled from the underlying input random variables. More specifically, the above-mentioned U-function focuses on the design subspace near the limit state boundary considering an exploitation-exploration trade-off, as mentioned in Section 1. Mathematically, each subsequent experimental design point, $\boldsymbol{x}^{*}$, is selected as:
$$\boldsymbol{x}^{*} = \arg\min_{\boldsymbol{x}} U(\boldsymbol{x}), \qquad U(\boldsymbol{x}) = \frac{|\mu_{\hat{g}}(\boldsymbol{x})|}{\sigma_{\hat{g}}(\boldsymbol{x})} \qquad (4)$$
where $\mu_{\hat{g}}(\boldsymbol{x})$ and $\sigma_{\hat{g}}(\boldsymbol{x})$ represent the mean and standard deviation, respectively, of the limit state predicted via a surrogate model. This learning metric favors design points that are close to the limit state boundary and have a high associated predicted variance. In order to calculate more accurate learning scores, the variance estimate provided by a PCK model for a specific design point can be corrected [6].
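The selection rule can be sketched in a few lines. The snippet below assumes that the surrogate has already returned a predicted mean and standard deviation for each candidate in a Monte Carlo population; the function names and the numerical values are illustrative only.

```python
import numpy as np


def u_function(mean, std, eps=1e-12):
    """U score: distance of the predicted mean to the limit state in std units."""
    return np.abs(mean) / np.maximum(std, eps)   # eps guards against zero variance


def select_next_point(candidates, mean, std):
    """Return the candidate with the lowest U score together with that score."""
    scores = u_function(mean, std)
    best = int(np.argmin(scores))
    return candidates[best], float(scores[best])


# Three hypothetical candidates with their surrogate predictions.
candidates = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])
mean = np.array([2.0, -0.1, 0.5])
std = np.array([1.0, 1.0, 0.1])
x_next, u_min = select_next_point(candidates, mean, std)
```

Here the second candidate is selected: its predicted mean is close to zero while its predicted standard deviation is large, so the sign of the limit state at that point is the most uncertain.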
The selection of subsequent experimental design points can thus be further improved by adequately correcting the generated variance predictions, which might be biased [7]. To do that, a correction factor is calculated from a leave-one-out (LOO) cross-validation analysis of the experimental design and then applied to the variance predictions within the Voronoi cell of each experimental design point:
[Equation 5]
where the correction relies on the vector of LOO squared errors, the vector of LOO variances, and the domain of the Voronoi cells.
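Since the exact expression is not reproduced here, the sketch below uses one common form of LOO-based variance adjustment as an assumption: the predicted variance at a new point is multiplied by one plus the ratio of the LOO squared error to the LOO variance of the nearest experimental design point. Assigning a point to its nearest design point is equivalent to locating its Voronoi cell. The function name and the numbers are hypothetical.

```python
import numpy as np


def corrected_variance(x_new, x_design, var_pred, loo_sq_err, loo_var):
    """Scale predicted variances by a LOO factor of the enclosing Voronoi cell.

    Assumed form: var * (1 + e_i^2 / s_i^2), with i the nearest design point.
    """
    dist = np.linalg.norm(x_new[:, None, :] - x_design[None, :, :], axis=-1)
    cell = np.argmin(dist, axis=1)               # index of the Voronoi cell
    factor = 1.0 + loo_sq_err[cell] / loo_var[cell]
    return var_pred * factor


# Two design points and two new points, one in each Voronoi cell.
x_design = np.array([[0.0, 0.0], [1.0, 1.0]])
x_new = np.array([[0.1, 0.0], [0.9, 1.0]])
var_pred = np.array([1.0, 2.0])
loo_sq_err = np.array([4.0, 0.0])    # squared LOO residuals at the design points
loo_var = np.array([2.0, 1.0])       # LOO predictive variances at the design points
var_corr = corrected_variance(x_new, x_design, var_pred, loo_sq_err, loo_var)
```

With these numbers the variance in the first cell is inflated by a factor of three, because the surrogate underestimated its own error there, while the second cell is left unchanged.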
2.3 Active learning strategies for multiple limit state functions settings
Active learning strategies developed for reliability analysis sequentially select experimental design points with the objective of more accurately representing a specific limit state function. As mentioned in Section 1, multiple limit state functions defined within a common input design domain can be of interest in many practical applications, e.g., informing operational decisions or estimating the probability associated with a structural failure event. If the experimental design points are specific to a particular limit state function, $g_i$, predictions generated for another limit state function, $g_j$, might not necessarily be accurate.
To equally balance a given computational budget, a logical active training scheme could, for instance, alternate its focus between the considered limit state functions, improving one at each consecutive step. If the experimental design points are scored via the U-function (Eq. 4), each training step evaluates the U-function with respect to the next limit state function in turn, cycling through all considered ones.
In practice, some limit state functions are substantially easier to train, hence rendering accurate predictions when trained with only a few observations. By implementing the alternate active learning strategy, the computational budget is equally allocated to all limit state functions, even if a certain limit state is accurately predicted by the surrogate model early in the sequential process. Instead, one can detect which surrogated limit state function is still far from reaching convergence and select the subsequent experimental design point by evaluating a learning scoring function with respect to the identified limit state. At each active learning step, a convergence-related metric inspired by the reliability index is here proposed for selecting the target limit state function:
[Equation 6]
At each training step, the target limit state function becomes the one associated with the maximum value of this metric, and hence the experimental design point is accordingly chosen from a learning metric (e.g., U-function) estimated according to the identified limit state function. Note that the reliability index, $\beta$, follows an inverse relationship with the probability associated with the event of interest, $p$, and is commonly defined in the standard normal space as $\beta = -\Phi^{-1}(p)$, where $\Phi$ is the standard normal cumulative distribution function and $p$ is computed as the probability of the limit state being negative, i.e., $p = P[g(\boldsymbol{x}) \leq 0]$.
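The two ways of choosing the target limit state can be sketched as follows. The alternate rule simply cycles through the limit states, whereas the convergence-based rule picks the limit state with the largest convergence metric. Because the paper's metric is not reproduced here, `beta_spread` is a hypothetical stand-in: it measures the relative spread of the reliability index when the predicted mean is shifted by plus or minus `k` standard deviations. All names and numbers are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import norm


def reliability_index(g_samples):
    """beta = -Phi^{-1}(p), with p the fraction of samples with g <= 0."""
    p = np.mean(np.asarray(g_samples) <= 0.0)
    return -norm.ppf(p)


def alternate_target(step, n_limit_states):
    """Alternate strategy: cycle through the limit states step by step."""
    return step % n_limit_states


def beta_spread(mean, std, k=2.0):
    """Hypothetical convergence metric: relative spread of beta bounds."""
    beta_0 = reliability_index(mean)
    beta_minus = reliability_index(mean - k * std)   # pessimistic bound
    beta_plus = reliability_index(mean + k * std)    # optimistic bound
    return abs(beta_plus - beta_minus) / abs(beta_0)


def convergence_target(metrics):
    """Convergence-based strategy: target the least converged limit state."""
    return int(np.argmax(metrics))


# Illustrative surrogate predictions over a Monte Carlo population.
rng = np.random.default_rng(0)
g_mean = rng.normal(1.0, 1.0, 100_000)
metrics = [beta_spread(g_mean, 0.05), beta_spread(g_mean, 0.5)]
target = convergence_target(metrics)
```

In this example the second surrogate carries a larger predictive standard deviation, so its metric is larger and it becomes the target of the next training step.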
3 NUMERICAL EXPERIMENTS
3.1 Analytical reliability problem
Inspired by [6], the first limit state, $g_1$, studied here is defined as:
[Equation 7]
For the sake of capturing the active training performance over a second limit state function, an additional performance function, $g_2$, is proposed and formulated as:
[Equation 8]
with random variables, and , described as and , respectively. The tested active learning strategies are restricted to a computational budget of 49 ground-truth observations. From the available budget, 10 training points are dedicated to an initial global exploration following a uniform stratification of samples through LHS, committing the remaining resources to the active training of the surrogate model. From the stationary correlation families, the general form of Matérn kernel of degree 5/2 is specified as the autocorrelation function .
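The budget split and the initial stratified design can be sketched as below. The sample is drawn in the unit hypercube only; mapping it to the physical input distributions is left out because those distributions are not reproduced here. The function name `latin_hypercube` and the two-dimensional input are assumptions for illustration.

```python
import numpy as np

N_TOTAL = 49                  # total budget of ground-truth observations
N_INITIAL = 10                # initial global exploration via LHS
N_ACTIVE = N_TOTAL - N_INITIAL  # observations left for active learning


def latin_hypercube(n, dim, rng):
    """Draw n points in [0, 1)^dim with one point per stratum and dimension."""
    jitter = rng.random((n, dim))                                # position inside a stratum
    strata = np.array([rng.permutation(n) for _ in range(dim)]).T  # stratum index per point
    return (strata + jitter) / n


rng = np.random.default_rng(1)
initial_design = latin_hypercube(N_INITIAL, 2, rng)
```

Each column of `initial_design` contains exactly one point in every interval of width 1/10, which is the uniform stratification that the initial exploration relies on.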
At each active learning step, the subsequent experimental design point is selected as the sample that minimizes the U-function over a population of randomly generated Monte Carlo samples. Also relying on training samples chosen according to U-function scoring metrics, active learning strategies that consider corrected variance predictions (Eq. 5) are additionally investigated.
The results are reported in terms of the relative error between the predicted reliability index of each limit state function, $\hat{\beta}$, and the ground truth, $\beta$, which is known in this setting. Formally, the error estimator associated with a target limit state function is calculated for each conducted experiment as:
[Equation 9]
In total, 15 experiments are executed for each tested active learning strategy considering both U and U-LOO scoring function metrics: (i) training points selected with respect to the first limit state function, $g_1$; (ii) training points that specifically target the second limit state function, $g_2$; (iii) the alternate sequential active learning approach; and (iv) training points selected based on the convergence criterion stated in Eq. 6. In order to objectively evaluate the error accumulated from both considered limit state functions, a total error metric is additionally formulated, providing a proxy for assessing the joint performance.
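A minimal sketch of how such an error summary could be computed is given below. It assumes the relative error takes the usual form $|\hat{\beta} - \beta| / \beta$ and that the mean and sample standard deviation are taken over repeated experiments; the function names and the numerical values are illustrative and are not results from the paper.

```python
import numpy as np


def relative_error(beta_pred, beta_true):
    """Relative error of predicted reliability indices against the ground truth."""
    return np.abs(np.asarray(beta_pred) - beta_true) / abs(beta_true)


def summarize(errors):
    """Mean and sample standard deviation of the errors over experiments."""
    errors = np.asarray(errors)
    return float(errors.mean()), float(errors.std(ddof=1))


# Hypothetical predictions from two repeated experiments.
errors = relative_error([2.2, 1.8], beta_true=2.0)
mean_err, std_err = summarize([0.1, 0.3])
```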
Results and discussion
Figure 1 showcases the evolution of the surrogated event probabilities of both limit states, estimated from all tested active training strategies. In particular, the resulting expected probability, bounded by the 30-60% percentiles, is represented over 49 training samples. With the purpose of interpreting the influence of the predicted variance on the learning evolution, strategies resulting from U-based learning scores computed from both PCK's variance (U) and corrected variance (U-LOO) are additionally compared.
As seen in the figure, PCK models trained with respect to a specific target limit state become inaccurate when the surrogate model is applied to estimate the other considered limit state function. In particular, the two single-target strategies render inaccurate estimates and significant training variance across experiments for the limit state they do not target. In contrast, strategies that sequentially target both limit state functions within the available computational budget are able to yield a more balanced result in terms of accuracy and training stability. The straightforward alternate strategy, for instance, concurrently reduces the inaccuracy gap resulting from both surrogated limit state functions over the training process. By implementing the strategy based on the convergence-related metric, more accurate predictions can be achieved for both limit state functions. This can be attributed to the fact that, at each learning step, the selected training point seeks the improvement of the most inaccurate surrogated limit state function.
The performance reached at the end of the training process is additionally described in Table 1, listing the expected relative error and its corresponding standard deviation for each considered limit state function, together with the total relative error. For each listed error metric, the best strategy is highlighted. Evidently, the single-target strategies yield accurate predictions for their corresponding target limit state function, yet high relative errors are observed when they are applied to the other limit state function, thus ultimately resulting in a high total error compared to the alternate and convergence-based strategies. When considering U-LOO-based learning score metrics, surprisingly becomes the best strategy for accurately representing the second limit state over all tested experiments, yet with a higher variability than .
To further inspect the spread of the reported relative error metric over experiments, a box plot is shown in Figure 2, delimiting the interquartile range, i.e., between the 25% and 75% quantiles, and featuring whiskers that extend from the 2.5% to the 97.5% quantiles. Additionally, the mean and median values are indicated with solid orange and dashed red lines, respectively. In the figure, one can clearly observe that the strategy based on the convergence-related metric yields a lower mean and median relative error in both U and U-LOO settings compared to its counterparts. While rendering accurate predictions for both limit state functions, the alternate active learning strategy logically results in higher variability over experiments compared to the convergence-based strategy, as the target limit state function is not intentionally assigned there. With respect to the learning metric, the reported results also show that correcting the variance retrieved from the surrogate model leads to more accurate learning strategies.
3.2 Offshore wind substructure subject to ship collision accidental events
In this second numerical experiment, we test the proposed active learning approaches for quantifying the probability associated with a repair and a failure event in the aftermath of a ship collision against an offshore wind substructure. All computations are conducted on an Intel Core processor with a clock speed of 3.50 GHz. The assumed ground truth corresponds, in this setting, to the substructure penetration computed from a simplified numerical model [8] based on limit plastic analysis. Since each numerical simulation requires approximately 3 minutes of computational time, a non-surrogated estimation of the failure and repair probabilities is computationally infeasible.
Both considered limit state functions are defined according to the resulting maximum substructure penetration, $\delta_{\max}$. A failure occurs if the maximum penetration exceeds a critical value, $\delta_{c}$:
$$g_F(\boldsymbol{x}) = \delta_{c} - \delta_{\max}(\boldsymbol{x}) \qquad (10)$$
whereas a repair event is stated as a function of a specified damage condition, $\delta_{d}$, as:
$$g_R(\boldsymbol{x}) = \delta_{d} - \delta_{\max}(\boldsymbol{x}) \qquad (11)$$
where $\delta_{c}$ and $\delta_{d}$ are specified as 3 and 2 meters, respectively. The random variables governing a collision scenario are the initial velocity of the ship and the flow stress of the material, which is assumed here to be elastic-perfectly plastic. These variables are probabilistically described as m/s and N/mm², respectively.
Results and discussion
The active learning evolution is showcased in Figure 3, representing both failure and damage probabilities over 10 experiments. Note that, in this case, only the corrected variance active learning scheme (U-LOO) is investigated. Since a closed-form solution of the analyzed failure and damage events is not available, all tested strategies are examined in terms of convergence. One can observe that most strategies reach convergence with only a few training samples, yet active learning schemes that target only one specific limit state function are evidently more unstable when applied to the other considered limit state. Instead, active learning strategies that interchangeably target all considered limit state functions converge for both the failure and the damage event probabilities, all trained under the same computational budget.
4 CONCLUSIONS
Surrogate-based active learning strategies for reliability analysis effectively yield accurate predictions for a specifically targeted limit state function, yet they may render inaccurate estimates for other regions within the design space. This paper reveals that, by sequentially targeting multiple limit state functions throughout the training process, combined active learning strategies are able to achieve a balanced computational budget allocation, resulting in low overall prediction errors.
Acknowledgement
Mr. Morán gratefully acknowledges the support received from the National Fund for Scientific Research in Belgium (F.R.I.A. - F.N.R.S.).
References
- [1] P. G. Morato, K. G. Papakonstantinou, C. P. Andriotis, J. S. Nielsen, P. Rigo, Optimal inspection and maintenance planning for deteriorating structural components through dynamic Bayesian networks and Markov decision processes, Structural Safety 94 (2022) 102140. doi:10.1016/j.strusafe.2021.102140.
- [2] R. Schöbi, B. Sudret, J. Wiart, Polynomial-chaos-based Kriging, International Journal for Uncertainty Quantification 5 (2) (2015) 171–193.
- [3] B. J. Bichon, M. S. Eldred, L. P. Swiler, S. Mahadevan, J. M. McFarland, Efficient global reliability analysis for nonlinear implicit performance functions, AIAA Journal 46 (10) (2008) 2459–2468. doi:10.2514/1.34321.
- [4] B. Echard, N. Gayton, M. Lemaire, AK-MCS: An active learning reliability method combining Kriging and Monte Carlo Simulation, Structural Safety 33 (2) (2011) 145–154. doi:10.1016/j.strusafe.2011.01.002.
- [5] M. Moustapha, S. Marelli, B. Sudret, Active learning for structural reliability: Survey, general framework and benchmark, Structural Safety 96 (2022) 102174. doi:10.1016/j.strusafe.2021.102174.
- [6] N.-C. Xiao, M. J. Zuo, W. Guo, Efficient reliability analysis based on adaptive sequential sampling design and cross-validation, Applied Mathematical Modelling 58 (2018) 404–420. doi:10.1016/j.apm.2018.02.012.
- [7] L. Le Gratiet, C. Cannamela, Cokriging-based sequential design strategies using fast cross-validation techniques for multi-fidelity computer codes, Technometrics 57 (3) (2015) 418–427.
- [8] T. Pire, Crashworthiness of offshore wind turbine jackets based on the continuous element method, PhD dissertation, University of Liège, Belgium (2018).
