Polynomial Regression Surrogate
This example demonstrates how a surrogate model based on polynomial regression is trained and used on a parametric problem. Additionally, the results are compared to those obtained using a Polynomial Chaos (PC) surrogate, and possible differences in applicability are highlighted. For more on the regression method used here, see PolynomialRegressionTrainer; details of Polynomial Chaos are available under PolynomialChaos.
Problem Statement
The full-order model in this example is essentially the same as the one described in Training a Surrogate Model. It is a one-dimensional heat conduction model:
$$-k\frac{d^2T}{dx^2} = q, \quad x \in [0, L],$$
$$\left.\frac{dT}{dx}\right|_{x=0} = 0, \qquad T(x=L) = T_{\infty},$$
where $T$ is the temperature, $k$ is the thermal conductivity, $L$ is the length of the domain, $q$ is a heat source and $T_{\infty}$ is the value of the Dirichlet boundary condition. To make the comparison between different surrogate models easier, only the maximum temperature is selected to be the Quantity of Interest (QoI):
$$T_{\max} = \max_{x \in [0, L]} T(x). \tag{1}$$
The problem is parametric in a sense that the solution depends on four input parameters: $T = T(x, k, q, L, T_{\infty})$. Two problem settings are considered in this example. In the first scenario, all of the parameters are assumed to have Uniform distributions ($\mathcal{U}(a, b)$), while the second considers parameters with Normal distributions ($\mathcal{N}(\mu, \sigma)$). To be more specific the distributions for the two cases are:
Parameter | Symbol | Uniform | Normal |
---|---|---|---|
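Conductivity | $k$ | $\mathcal{U}(1, 10)$ | $\mathcal{N}(5, 2)$ |
Volumetric Heat Source | $q$ | $\mathcal{U}(9000, 11000)$ | $\mathcal{N}(10000, 500)$ |
Domain Size | $L$ | $\mathcal{U}(0.01, 0.05)$ | $\mathcal{N}(0.03, 0.01)$ |
Right Boundary Temperature | $T_{\infty}$ | $\mathcal{U}(290, 310)$ | $\mathcal{N}(300, 10)$ |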
The parameters of the uniform distribution are the minimum and maximum bounds, while the parameters of the normal distribution are the mean and standard deviation. It must be mentioned that the maximum temperature can be determined analytically and turns out to be:
$$T_{\max}(k, q, L, T_{\infty}) = \frac{qL^2}{2k} + T_{\infty}.$$
Using this expression and the previously described probability density functions, the mean ($\mu$) and standard deviation ($\sigma$) of the QoI can be computed for reference:
Table 1: The reference results for the mean and standard deviation of the maximum temperature.
Moment | Uniform | Normal |
---|---|---|
$\mu$ | 301.3219 | 301.2547 |
$\sigma$ | 5.9585 | 10.0011 |
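The uniform-case reference values can be cross-checked in closed form. The sketch below assumes the analytic maximum temperature $T_{\max} = qL^2/(2k) + T_{\infty}$ (the solution of the heat conduction problem with a zero-flux left boundary) and independent inputs, so that the expectation of the product factors into one-dimensional moments. The helper names are hypothetical, and the normal case is not covered here.

```python
import math

def uniform_moment(a, b, p):
    """E[X**p] for X ~ Uniform(a, b); valid for any integer p except -1."""
    return (b ** (p + 1) - a ** (p + 1)) / ((p + 1) * (b - a))

def uniform_inverse_mean(a, b):
    """E[1/X] for X ~ Uniform(a, b) with 0 < a < b."""
    return math.log(b / a) / (b - a)

# Bounds of the uniform case, as in the Distributions block of uniform_train.i.
k_lo, k_hi = 1.0, 10.0
q_lo, q_hi = 9000.0, 11000.0
L_lo, L_hi = 0.01, 0.05
T_lo, T_hi = 290.0, 310.0

# X = q L^2 / (2 k); independence lets the expectations factor.
mean_x = (uniform_moment(q_lo, q_hi, 1) * uniform_moment(L_lo, L_hi, 2)
          * uniform_inverse_mean(k_lo, k_hi) / 2.0)
mean_x2 = (uniform_moment(q_lo, q_hi, 2) * uniform_moment(L_lo, L_hi, 4)
           * uniform_moment(k_lo, k_hi, -2) / 4.0)

# T_max = X + T_inf with X and T_inf independent, so the variances add.
mean_T = mean_x + uniform_moment(T_lo, T_hi, 1)
var_T = (mean_x2 - mean_x ** 2) + (T_hi - T_lo) ** 2 / 12.0
std_T = math.sqrt(var_T)

print(f"mean = {mean_T:.4f}, std = {std_T:.4f}")
```

The printed values agree with the Uniform column of Table 1 to four decimal places.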
Solving the problem without uncertain parameters
The first step towards creating a surrogate model is the generation of a full-order model which can solve the heat conduction problem and evaluate the QoI in Eq. (1) with fixed parameter combinations. The complete input file for this case is presented in Listing 1.
Listing 1: Complete input file for the heat equation problem in this study.
[Mesh]
  type = GeneratedMesh
  dim = 1
  nx = 100
  xmax = 1
  elem_type = EDGE3
[]

[Variables]
  [T]
    order = SECOND
    family = LAGRANGE
  []
[]

[Kernels]
  [diffusion]
    type = MatDiffusion
    variable = T
    diffusivity = k
  []
  [source]
    type = BodyForce
    variable = T
    value = 1.0
  []
[]

[Materials]
  [conductivity]
    type = GenericConstantMaterial
    prop_names = k
    prop_values = 2.0
  []
[]

[BCs]
  [right]
    type = DirichletBC
    variable = T
    boundary = right
    value = 300
  []
[]

[Executioner]
  type = Steady
  solve_type = PJFNK
  petsc_options_iname = '-pc_type -pc_hypre_type'
  petsc_options_value = 'hypre boomeramg'
[]

[Postprocessors]
  [max]
    type = NodalExtremeValue
    variable = T
    value_type = max
  []
[]

[Outputs]
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/sub.i)
Training surrogate models
Both surrogate models are constructed from data generated by the full-order problem: the full-order problem is solved multiple times with different parameter samples, and the value of the QoI is stored from each computation. This step is managed by a master input file, which creates parameter samples, transfers them to the sub-application, and collects the results from the completed computations. For more information about setting up master input files, see Training a Surrogate Model and Parameter Study. The complete training input files for the two parameter distributions are available under uniform and normal.
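The sample, solve, and collect loop described above can be sketched outside of MOOSE. The block below is a minimal illustration under stated assumptions: a small finite-difference solver stands in for the sub-application (it is not the finite element solve of sub.i), plain Monte Carlo draws stand in for the samplers, and the function names are hypothetical.

```python
import numpy as np

def solve_heat_1d(k, q, L, T_inf, n=100):
    """Finite-difference stand-in for the full-order model on n + 1 nodes."""
    h = L / n
    A = np.zeros((n + 1, n + 1))
    b = np.full(n + 1, q * h ** 2 / k)
    # Left boundary: zero flux, imposed with a mirrored ghost node.
    A[0, 0], A[0, 1] = 2.0, -2.0
    # Interior nodes: -(T[i-1] - 2 T[i] + T[i+1]) = q h^2 / k.
    for i in range(1, n):
        A[i, i - 1], A[i, i], A[i, i + 1] = -1.0, 2.0, -1.0
    # Right boundary: Dirichlet value T_inf.
    A[n, n] = 1.0
    b[n] = T_inf
    return np.linalg.solve(A, b)

def max_temperature(k, q, L, T_inf):
    """Quantity of interest: maximum temperature over the domain."""
    return solve_heat_1d(k, q, L, T_inf).max()

rng = np.random.default_rng(0)
n_samples = 5
# One row per sample, columns ordered as (k, q, L, T_inf), uniform case.
samples = np.column_stack([
    rng.uniform(1.0, 10.0, n_samples),
    rng.uniform(9000.0, 11000.0, n_samples),
    rng.uniform(0.01, 0.05, n_samples),
    rng.uniform(290.0, 310.0, n_samples),
])

# "Run the sub-application" for every row and collect the QoI.
qoi = np.array([max_temperature(*row) for row in samples])

# Compare against the analytic maximum q L^2 / (2 k) + T_inf.
analytic = samples[:, 1] * samples[:, 2] ** 2 / (2.0 * samples[:, 0]) + samples[:, 3]
print(np.max(np.abs(qoi - analytic)))
```

Because the exact temperature profile is quadratic in the coordinate, the central-difference stencil reproduces it up to round-off, which makes the stand-in convenient for checking the collected values.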
The training phase starts with the definition of the distributions in the Distributions block. The uniform distributions can be defined as:
[Distributions]
  [k_dist]
    type = Uniform
    lower_bound = 1
    upper_bound = 10
  []
  [q_dist]
    type = Uniform
    lower_bound = 9000
    upper_bound = 11000
  []
  [L_dist]
    type = Uniform
    lower_bound = 0.01
    upper_bound = 0.05
  []
  [Tinf_dist]
    type = Uniform
    lower_bound = 290
    upper_bound = 310
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/uniform_train.i)
For the case with normal distributions the block changes to:
[Distributions]
  [k_dist]
    type = Normal
    mean = 5
    standard_deviation = 2
  []
  [q_dist]
    type = Normal
    mean = 10000
    standard_deviation = 500
  []
  [L_dist]
    type = Normal
    mean = 0.03
    standard_deviation = 0.01
  []
  [Tinf_dist]
    type = Normal
    mean = 300
    standard_deviation = 10
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/normal_train.i)
As a next step, several parameter instances are prepared by sampling the underlying distributions. The sampling objects are defined in the Samplers block. The generation of these parameter samples is different for the two surrogate models: while polynomial chaos uses samples at specific quadrature points in the parameter space (generated by a QuadratureSampler), the polynomial regression model is trained using samples from a LatinHypercube. The number of samples (num_rows) in the LatinHypercube is set to match the number of samples in the tensor-product quadrature set of the QuadratureSampler.
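The two sample sets can be illustrated with a short sketch. It assumes that a tensor-product rule of order 8 uses order + 1 points per parameter (an assumption, not stated in the text) and implements a minimal Latin hypercube for independent uniform inputs. The function name is hypothetical and this is not the MOOSE sampler.

```python
import numpy as np

def latin_hypercube(n_rows, bounds, rng):
    """Minimal Latin hypercube for independent uniform inputs."""
    bounds = np.asarray(bounds, dtype=float)
    n_cols = bounds.shape[0]
    unit = np.empty((n_rows, n_cols))
    for j in range(n_cols):
        # One point per equal-probability stratum, then shuffle the strata.
        strata = (np.arange(n_rows) + rng.random(n_rows)) / n_rows
        unit[:, j] = rng.permutation(strata)
    # Map the unit hypercube onto the requested bounds.
    return bounds[:, 0] + unit * (bounds[:, 1] - bounds[:, 0])

order, n_params = 8, 4
# Assumption: order + 1 quadrature points per dimension in the tensor grid.
n_quadrature = (order + 1) ** n_params
print(n_quadrature)

# Uniform case, columns ordered as (k, q, L, T_inf); 6560 rows as in num_rows.
bounds = [(1.0, 10.0), (9000.0, 11000.0), (0.01, 0.05), (290.0, 310.0)]
samples = latin_hypercube(6560, bounds, np.random.default_rng(0))
print(samples.shape)
```

Under that assumption the tensor grid holds 9^4 = 6561 points, one more than the 6560 rows requested from the LatinHypercube. Both samplers are declared in the Samplers block of the training input: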
[Samplers]
  [pc_sampler]
    type = Quadrature
    order = 8
    distributions = 'k_dist q_dist L_dist Tinf_dist'
    execute_on = PRE_MULTIAPP_SETUP
  []
  [pr_sampler]
    type = LatinHypercube
    distributions = 'k_dist q_dist L_dist Tinf_dist'
    num_rows = 6560
    execute_on = PRE_MULTIAPP_SETUP
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/normal_train.i)
The objects in blocks Controls, MultiApps, Transfers and Reporters are responsible for managing the communication between master and sub-applications, execution of the sub-applications and the collection of the results. For a more detailed description of these blocks see Parameter Study and Training a Surrogate Model.
[MultiApps]
  [pc_sub]
    type = SamplerFullSolveMultiApp
    input_files = sub.i
    sampler = pc_sampler
  []
  [pr_sub]
    type = SamplerFullSolveMultiApp
    input_files = sub.i
    sampler = pr_sampler
  []
[]

[Controls]
  [pc_cmdline]
    type = MultiAppSamplerControl
    multi_app = pc_sub
    sampler = pc_sampler
    param_names = 'Materials/conductivity/prop_values Kernels/source/value Mesh/xmax BCs/right/value'
  []
  [pr_cmdline]
    type = MultiAppSamplerControl
    multi_app = pr_sub
    sampler = pr_sampler
    param_names = 'Materials/conductivity/prop_values Kernels/source/value Mesh/xmax BCs/right/value'
  []
[]

[Transfers]
  [pc_data]
    type = SamplerReporterTransfer
    from_multi_app = pc_sub
    sampler = pc_sampler
    stochastic_reporter = results
    from_reporter = 'max/value'
  []
  [pr_data]
    type = SamplerReporterTransfer
    from_multi_app = pr_sub
    sampler = pr_sampler
    stochastic_reporter = results
    from_reporter = 'max/value'
  []
[]

[Reporters]
  [results]
    type = StochasticReporter
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/normal_train.i)
The next step is to set up two Trainer objects to generate the surrogate models from the available data. This is done in the Trainers block. Both trainers use the data from Sampler and Reporter objects. A polynomial chaos surrogate of order 8 and a polynomial regression surrogate with a polynomial of degree at most 4 are used in this study. The PolynomialChaosTrainer also needs knowledge about the underlying parameter distributions to be able to select matching polynomials.
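To make the regression step concrete, the following is a minimal sketch of ordinary least squares (OLS) polynomial regression in four variables. It assumes that the degree limit bounds the total degree of each monomial, and it scales the inputs to [-1, 1] for conditioning; both are choices of this sketch and not necessarily what PolynomialRegressionTrainer does. The analytic maximum temperature $qL^2/(2k) + T_{\infty}$ stands in for the collected full-order results, and all function names are hypothetical.

```python
import itertools
import numpy as np

def multi_indices(n_vars, max_degree):
    """All exponent tuples whose total degree is at most max_degree."""
    return [p for p in itertools.product(range(max_degree + 1), repeat=n_vars)
            if sum(p) <= max_degree]

def design_matrix(x, powers):
    """One column per monomial prod_j x[:, j] ** p[j]."""
    return np.column_stack([np.prod(x ** np.array(p), axis=1) for p in powers])

def fit_ols(x, y, max_degree):
    """Least-squares coefficients of the polynomial model."""
    powers = multi_indices(x.shape[1], max_degree)
    coeffs, *_ = np.linalg.lstsq(design_matrix(x, powers), y, rcond=None)
    return powers, coeffs

def predict(x, powers, coeffs):
    """Evaluate the fitted polynomial at the rows of x."""
    return design_matrix(x, powers) @ coeffs

rng = np.random.default_rng(0)
z = rng.uniform(-1.0, 1.0, size=(500, 4))  # scaled inputs in [-1, 1]
lo = np.array([1.0, 9000.0, 0.01, 290.0])
hi = np.array([10.0, 11000.0, 0.05, 310.0])
k, q, L, T_inf = (lo + 0.5 * (z + 1.0) * (hi - lo)).T

# Stand-in for the full-order results: analytic maximum temperature.
y = q * L ** 2 / (2.0 * k) + T_inf

powers, coeffs = fit_ols(z, y, max_degree=4)
rms = np.sqrt(np.mean((predict(z, powers, coeffs) - y) ** 2))
print(len(powers), rms)
```

With four parameters and a total degree of at most 4, the model has 70 monomial terms. In the input file, the two trainers are declared in the Trainers block: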
[Trainers]
  [pc_max]
    type = PolynomialChaosTrainer
    execute_on = timestep_end
    order = 8
    distributions = 'k_dist q_dist L_dist Tinf_dist'
    sampler = pc_sampler
    response = results/pc_data:max:value
  []
  [pr_max]
    type = PolynomialRegressionTrainer
    regression_type = "ols"
    execute_on = timestep_end
    max_degree = 4
    sampler = pr_sampler
    response = results/pr_data:max:value
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/uniform_train.i)
As a last step in the training process, the important parameters of the trained surrogates are saved into .rd files. These files can be used to construct the surrogate models again without the need to carry out the training process from the beginning.
[Outputs]
  [pc_out]
    type = SurrogateTrainerOutput
    trainers = 'pc_max'
    execute_on = FINAL
  []
  [pr_out]
    type = SurrogateTrainerOutput
    trainers = 'pr_max'
    execute_on = FINAL
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/normal_train.i)
Evaluation of surrogate models
To evaluate the surrogate models, a new master input file is created for each of the uniform and normal parameter distributions. The input files contain testing distributions for the parameters, defined in the Distributions block. In this study, the training distributions are used for the testing of the surrogates as well. Both surrogate models are tested using the same parameter samples. These samples are selected using a LatinHypercube defined in the Samplers block. Since the surrogate models are orders of magnitude faster than the full-order model, 100000 samples are selected for testing (compared to 6560 used for training).
[Samplers]
  [sample]
    type = LatinHypercube
    num_rows = 100000
    distributions = 'k_dist q_dist L_dist Tinf_dist'
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/normal_surr.i)
As a next step, two objects are created in the Surrogates block for the two surrogate modeling techniques. Both of them are constructed using the information available within the corresponding .rd files.
[Surrogates]
  [pc_max]
    type = PolynomialChaos
    filename = 'normal_train_pc_out_pc_max.rd'
  []
  [pr_max]
    type = PolynomialRegressionSurrogate
    filename = 'normal_train_pr_out_pr_max.rd'
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/normal_surr.i)
These surrogate models can be evaluated at the points defined in the testing sample batch. This is done using objects in the Reporters block.
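In the listing that follows, the statistics of the regression surrogate are computed from its sampled outputs (StatisticsReporter), while those of the polynomial chaos surrogate are requested from the surrogate itself (PolynomialChaosReporter). The one-dimensional sketch below shows why an orthogonal expansion allows this: the mean is the zeroth coefficient and the variance is a weighted sum of the squared remaining coefficients. It assumes a Legendre basis for a Uniform(-1, 1) input and a hypothetical response function; it is not the MOOSE implementation.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_pc_coefficients(f, order):
    """Project f onto Legendre polynomials for x ~ Uniform(-1, 1)."""
    nodes, weights = legendre.leggauss(order + 1)
    values = f(nodes)
    coeffs = np.empty(order + 1)
    for n in range(order + 1):
        basis = legendre.legval(nodes, [0.0] * n + [1.0])
        # <f, P_n> / <P_n, P_n> under the density 1/2 on [-1, 1].
        coeffs[n] = (2 * n + 1) / 2.0 * np.sum(weights * values * basis)
    return coeffs

def pc_mean_std(coeffs):
    """Moments implied by the expansion; orthogonality removes cross terms."""
    n = np.arange(1, len(coeffs))
    variance = np.sum(coeffs[1:] ** 2 / (2 * n + 1))
    return coeffs[0], np.sqrt(variance)

def response(x):
    # Hypothetical smooth response, not the heat conduction QoI.
    return 300.0 + 5.0 * x + x ** 2

coeffs = legendre_pc_coefficients(response, order=4)
pc_mean, pc_std = pc_mean_std(coeffs)

# Sampling-based estimate of the same moments, for comparison.
samples = response(np.random.default_rng(0).uniform(-1.0, 1.0, 100_000))
print(pc_mean, pc_std)
print(samples.mean(), samples.std(ddof=1))
```

The coefficient-based moments carry no sampling error, whereas the sampled estimates converge only as the number of samples grows. In the input file, the corresponding evaluation and statistics objects are declared in the Reporters block: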
[Reporters]
  [pc_max_res]
    type = EvaluateSurrogate
    model = pc_max
    sampler = sample
    parallel_type = ROOT
  []
  [pr_max_res]
    type = EvaluateSurrogate
    model = pr_max
    sampler = sample
    parallel_type = ROOT
  []
  [pr_max_stats]
    type = StatisticsReporter
    reporters = 'pr_max_res/pr_max'
    compute = 'mean stddev'
  []
  [pc_max_stats]
    type = PolynomialChaosReporter
    pc_name = 'pc_max'
    statistics = 'mean stddev'
  []
[]
(moose/modules/stochastic_tools/examples/surrogates/polynomial_regression/normal_surr.i)
Results and Analysis
In this section the results from the different surrogate models are provided. They are compared to the reference results summarized in Table 1. A short analysis of the results is provided as well to showcase potential issues the user might encounter when using polynomial regression.
Uniform parameter distributions
First, the case with parameters having uniform distributions is investigated. The statistical moments obtained by the execution of the surrogate models are summarized in Table 2.
Table 2: Comparison of the statistical moments from different surrogate models assuming uniform parameter distributions.
Moment | Reference | Poly. Chaos | Poly. Reg. (deg. 4) | Poly. Reg. (deg. 8) |
---|---|---|---|---|
$\mu$ | 301.3219 | 301.3218 | 301.3234 | 301.3220 |
$\sigma$ | 5.9585 | 5.9586 | 5.9625 | 5.9655 |
The polynomial chaos surrogate gives results at least as close to the reference values as either regression model. Increasing the polynomial degree of the regression from 4 to 8 brings the mean closer to the reference, but slightly decreases the accuracy of the standard deviation. The histogram of the results is presented in Figure 1; the polynomial regression results shown there were obtained using max_degree=4. The two methods give similar solutions.
Figure 1: Histogram of the maximum temperature coming from the Monte Carlo run using the surrogate models and assuming uniform parameter distributions.
Normal parameter distributions
Next, the case with normally distributed parameters is analyzed. The statistical moments of the results from testing the surrogate models are summarized in Table 3.
Table 3: Comparison of the statistical moments from different surrogate models assuming normal distributions.
Moment | Reference | Poly. Chaos | Poly. Reg. (deg. 4) | Poly. Reg. (deg. 8) |
---|---|---|---|---|
$\mu$ | 301.2547 | 301.3162 | 301.5663 | 301.5810 |
$\sigma$ | 10.0011 | 10.1125 | 11.2912 | 30.1675 |
The polynomial chaos surrogate again gives the closest results to the reference values. Furthermore, increasing the polynomial degree of the regression decreases the accuracy of both the mean and the standard deviation. This behavior is often referred to as overfitting: the accuracy decreases as the number of model parameters increases. The histogram of the results is presented in Figure 2; the polynomial regression results shown there were obtained using max_degree=4. The two methods give similar solutions; however, the tails of the polynomial regression histogram are longer.
Figure 2: Histogram of the maximum temperature coming from the Monte Carlo run using the surrogate models and assuming normal parameter distributions.
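The deviations discussed above can be quantified as relative errors with respect to the reference values. The sketch below simply tabulates them from the numbers in Tables 2 and 3 (first row mean, second row standard deviation); the dictionary layout and function name are choices of this sketch.

```python
# (mean, standard deviation) pairs copied from Tables 2 and 3.
reference = {"uniform": (301.3219, 5.9585), "normal": (301.2547, 10.0011)}
surrogates = {
    "uniform": {
        "Poly. Chaos": (301.3218, 5.9586),
        "Poly. Reg. (deg. 4)": (301.3234, 5.9625),
        "Poly. Reg. (deg. 8)": (301.3220, 5.9655),
    },
    "normal": {
        "Poly. Chaos": (301.3162, 10.1125),
        "Poly. Reg. (deg. 4)": (301.5663, 11.2912),
        "Poly. Reg. (deg. 8)": (301.5810, 30.1675),
    },
}

def relative_errors(case):
    """Relative error of the mean and standard deviation for each surrogate."""
    ref_mean, ref_std = reference[case]
    return {
        name: (abs(mean - ref_mean) / ref_mean, abs(std - ref_std) / ref_std)
        for name, (mean, std) in surrogates[case].items()
    }

for case in reference:
    for name, (err_mean, err_std) in relative_errors(case).items():
        print(f"{case:8s} {name:20s} mean: {err_mean:.2e}  std: {err_std:.2e}")
```

For the normal distributions, the degree-8 regression overestimates the standard deviation by roughly a factor of three (relative error of about 2.0), against about 0.13 for degree 4 and about 0.011 for polynomial chaos.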