AISActiveLearning

Adaptive Importance Sampler with Gaussian Process Active Learning.

Description

As stated in AdaptiveImportanceSampler, an Adaptive Importance Sampler (AIS) has two steps: (1) a Markov chain Monte Carlo (MCMC) sampler is used to learn the importance region; and (2) regular Monte Carlo samples are drawn from the importance region to reduce the variance when estimating a quantity of interest (QoI), such as the failure probability. In the traditional AIS, however, each MCMC or Monte Carlo sample requires a full model evaluation. While AIS considerably reduces the computational cost of estimating a QoI compared to a regular Monte Carlo sampler, further computational savings can be obtained by integrating active learning into AIS.
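The two steps can be illustrated with a minimal, self-contained Python sketch. It is not the MOOSE implementation: the one-dimensional standard normal input, the stand-in `model`, the Gaussian importance density, and all function names are assumptions made for illustration. The argument names `proposal_std` and `std_factor` are borrowed from the sampler parameters only for readability.

```python
import math
import random


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def model(x):
    """Hypothetical stand-in for the expensive full model."""
    return x


def estimate_failure_probability(limit=3.0, n_train=2000, n_is=20000,
                                 proposal_std=1.0, std_factor=0.9, seed=9874):
    rng = random.Random(seed)

    # Step 1: random-walk Metropolis restricted to the failure region
    # (model output >= limit), targeting the nominal input density.
    x = limit + 0.5  # assumed starting point inside the failure region
    chain = []
    for _ in range(n_train):
        candidate = x + rng.gauss(0.0, proposal_std)
        if model(candidate) >= limit:
            ratio = normal_pdf(candidate) / normal_pdf(x)
            if rng.random() < min(1.0, ratio):
                x = candidate
        chain.append(x)

    # Fit a Gaussian importance density to the chain.
    mu_q = sum(chain) / len(chain)
    var_q = sum((c - mu_q) ** 2 for c in chain) / (len(chain) - 1)
    sigma_q = std_factor * math.sqrt(var_q)

    # Step 2: sample from the importance density and reweight each
    # failure sample by nominal density / importance density.
    total = 0.0
    for _ in range(n_is):
        xs = rng.gauss(mu_q, sigma_q)
        if model(xs) >= limit:
            total += normal_pdf(xs) / normal_pdf(xs, mu_q, sigma_q)
    return total / n_is


if __name__ == "__main__":
    exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))
    print("importance sampling estimate:", estimate_failure_probability())
    print("exact tail probability:      ", exact)
```

In this sketch every sample in both steps calls `model`, which is the cost that active learning aims to reduce.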

Active learning is based on the Gaussian process (GP) surrogate; see ActiveLearningGaussianProcess. Once the GP is trained with a few outputs from the full model, a GP prediction and its uncertainty are computed for every new input sample from either MCMC or Monte Carlo. The prediction and the uncertainty are used to assess the prediction quality with the aid of active learning functions. If the GP prediction quality is good, we simply move on to a new input sample. Otherwise, we call the full model and retrain the GP with the new sample included in the training set to improve future predictive performance.
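As a rough illustration of how a learning function can turn a prediction and its uncertainty into a decision, the sketch below assumes the commonly used U-function form, U = |mean - limit| / std, and assumes that a value of U below the threshold marks the prediction as inadequate. The function names are hypothetical. The default values 0.65 and 2.0 mirror `learning_function_parameter` and `learning_function_threshold` from the test input file, and treating the former as the limit is an assumption; consult ActiveLearningGPDecision for the actual definitions.

```python
def u_function(gp_mean, gp_std, limit):
    """Assumed U-function: distance of the mean from the limit in units of std."""
    if gp_std == 0.0:
        # No predictive uncertainty: treat the prediction as fully trusted.
        return float("inf")
    return abs(gp_mean - limit) / gp_std


def needs_full_model(gp_mean, gp_std, limit=0.65, threshold=2.0):
    """Return True when the surrogate prediction is judged inadequate."""
    return u_function(gp_mean, gp_std, limit) < threshold


if __name__ == "__main__":
    # Mean close to the limit relative to the uncertainty: call the full model.
    print(needs_full_model(0.70, 0.10))
    # Mean far from the limit relative to the uncertainty: keep the prediction.
    print(needs_full_model(1.50, 0.10))
```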

Interaction between AISActiveLearning, ActiveLearningGPDecision, and AdaptiveMonteCarloDecision

Active learning in AIS primarily relies on three objects: AISActiveLearning, ActiveLearningGPDecision, and AdaptiveMonteCarloDecision. The interaction between these objects is presented in Figure 1 and is further discussed below.

Figure 1: Schematic of active learning in Adaptive Importance Sampling. The interaction between the three objects, AISActiveLearning, ActiveLearningGPDecision, and AdaptiveMonteCarloDecision, is presented.

Once the GP is trained, AISActiveLearning proposes a new input sample using either MCMC or Monte Carlo. By default, ActiveLearningGPDecision uses the GP to predict the model output and to assess the prediction quality.

The details on how the GP is initially trained and subsequently re-trained are discussed in ActiveLearningGPDecision.
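The propose, predict, and decide cycle described above can be sketched in plain Python. Everything here is a hypothetical stand-in rather than the MOOSE objects: a nearest-neighbor surrogate plays the role of the GP (its "standard deviation" is simply the distance to the closest training point), proposals are drawn uniformly instead of by MCMC, and the decision rule is the assumed U-function form |mean - limit| / std < threshold. The sketch also does not model how the flag influences the next proposal.

```python
import math
import random


def full_model(x):
    """Hypothetical stand-in for an expensive simulation."""
    return math.sin(3.0 * x) + x


class NearestNeighborSurrogate:
    """Toy stand-in for a GP: returns the nearest training output and an
    uncertainty equal to the distance to that training point."""

    def __init__(self):
        self.xs = []
        self.ys = []

    def add(self, x, y):
        self.xs.append(x)
        self.ys.append(y)

    def predict(self, x):
        dist, y = min((abs(x - xi), yi) for xi, yi in zip(self.xs, self.ys))
        return y, dist


def run_active_learning(n_initial=5, n_steps=200, limit=0.65,
                        threshold=2.0, seed=9874):
    rng = random.Random(seed)
    surrogate = NearestNeighborSurrogate()

    # Initial training: every sample is evaluated with the full model.
    for _ in range(n_initial):
        x = rng.uniform(-2.0, 2.0)
        surrogate.add(x, full_model(x))

    flags, outputs = [], []
    for _ in range(n_steps):
        x = rng.uniform(-2.0, 2.0)          # proposal (sampler's role)
        mean, std = surrogate.predict(x)    # surrogate prediction
        u = abs(mean - limit) / std if std > 0.0 else math.inf
        flag = u < threshold                # True: prediction inadequate
        if flag:
            y = full_model(x)               # fall back to the full model
            surrogate.add(x, y)             # and extend the training set
        else:
            y = mean                        # accept the surrogate prediction
        flags.append(flag)
        outputs.append(y)

    n_full_model_calls = n_initial + sum(flags)
    return flags, outputs, n_full_model_calls


if __name__ == "__main__":
    flags, outputs, n_calls = run_active_learning()
    print("full model calls:", n_calls, "of", len(flags) + 5, "samples")
```

The returned `flags` list is the analogue of the `flag_sample` reporter value: it records, per sample, whether the surrogate prediction was judged inadequate.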

Note: Start of the importance sampling procedure

Importance sampling with MCMC does not start until the initial GP training is finished.

Input file syntax

Once the interaction between the three objects is understood, the input file syntax is easy to follow.

The AISActiveLearning samplers block is largely similar to that of AdaptiveImportanceSampler. One difference is the "flag_sample" parameter, which identifies whether the GP prediction is good or bad. This dictates the next input proposal.

[Samplers]
  [sample]
    type = AISActiveLearning
    distributions = 'mu1 mu2'
    proposal_std = '1.0 1.0'
    output_limit = 0.65
    num_samples_train = 15
    num_importance_sampling_steps = 5
    std_factor = 0.9
    initial_values = '-0.103 1.239'
    inputs_reporter = 'adaptive_MC/inputs'
    use_absolute_value = true
    flag_sample = 'conditional/flag_sample'
    seed = 9874
  []
[]
(moose/modules/stochastic_tools/test/tests/reporters/AISActiveLearning/ais_al.i)

The ActiveLearningGPDecision reporters block is the same as for active learning in Monte Carlo sampling. See ActiveLearningGPDecision for the details.

[Reporters]
  [conditional]
    type = ActiveLearningGPDecision
    sampler = sample
    parallel_type = ROOT
    execute_on = 'initial timestep_begin'
    flag_sample = 'flag_sample'
    inputs = 'inputs'
    gp_mean = 'gp_mean'
    gp_std = 'gp_std'
    n_train = 5
    al_gp = GP_al_trainer
    gp_evaluator = GP_eval
    learning_function = 'Ufunction'
    learning_function_parameter = 0.65
    learning_function_threshold = 2.0
  []
[]
(moose/modules/stochastic_tools/test/tests/reporters/AISActiveLearning/ais_al.i)

The AdaptiveMonteCarloDecision reporters block is also largely similar to that used with AdaptiveImportanceSampler. One difference is that the GP mean prediction is used instead of the full model outputs.

[Reporters]
  [adaptive_MC]
    type = AdaptiveMonteCarloDecision
    output_value = conditional/gp_mean
    inputs = 'inputs'
    sampler = sample
    gp_decision = conditional
  []
[]
(moose/modules/stochastic_tools/test/tests/reporters/AISActiveLearning/ais_al.i)

Adaptive importance statistics reporter

The AdaptiveImportanceStats reporter can also be used in AIS with active learning. The syntax is shown below.

[Reporters]
  [ais_stats]
    type = AdaptiveImportanceStats
    output_value = conditional/gp_mean
    sampler = sample
    flag_sample = 'conditional/flag_sample'
  []
[]
(moose/modules/stochastic_tools/test/tests/reporters/AISActiveLearning/ais_al.i)
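This page does not list which statistics the reporter produces. As a hedged illustration of the kind of quantity typically derived from importance samples, the sketch below computes the textbook failure probability estimate and its coefficient of variation from per-sample weighted indicators (failure indicator times nominal density divided by importance density). The function name and its input format are assumptions for illustration, not the AdaptiveImportanceStats interface.

```python
import math


def importance_stats(weighted_indicators):
    """Return (failure probability estimate, coefficient of variation).

    Each entry is indicator * nominal_density / importance_density for one
    importance sample; entries are zero for non-failure samples.
    """
    n = len(weighted_indicators)
    if n < 2:
        raise ValueError("at least two samples are needed")
    pf = sum(weighted_indicators) / n
    # Sample variance of the weighted indicators.
    var = sum((w - pf) ** 2 for w in weighted_indicators) / (n - 1)
    std_err = math.sqrt(var / n)
    cov = std_err / pf if pf > 0.0 else float("inf")
    return pf, cov


if __name__ == "__main__":
    pf, cov = importance_stats([0.0, 0.002, 0.004, 0.002])
    print("pf =", pf, "cov =", cov)
```

A smaller coefficient of variation indicates a more reliable estimate for the given number of importance samples.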

Output format

Input Parameters

  • distributions

    C++ Type: std::vector<DistributionName>

    Controllable: No

    Description: The distribution names to be sampled, the number of distributions provided defines the number of columns per matrix.

  • initial_values

    C++ Type: std::vector<double>

    Unit: (no unit assumed)

    Controllable: No

    Description: Initial input values to get the importance sampler started

  • inputs_reporter

    C++ Type: ReporterName

    Controllable: No

    Description: Reporter with input parameters.

  • num_importance_sampling_steps

    C++ Type: int

    Controllable: No

    Description: Number of importance sampling steps (after the importance distribution has been trained)

  • num_samples_train

    C++ Type: int

    Controllable: No

    Description: Number of samples to learn the importance distribution

  • output_limit

    C++ Type: double

    Unit: (no unit assumed)

    Controllable: No

    Description: Limiting values of the VPPs

  • proposal_std

    C++ Type: std::vector<double>

    Unit: (no unit assumed)

    Controllable: No

    Description: Standard deviations of the proposal distributions

  • std_factor

    C++ Type: double

    Unit: (no unit assumed)

    Controllable: No

    Description: Factor to be multiplied to the standard deviation of the importance samples

Required Parameters

  • execute_on

    Default: LINEAR

    C++ Type: ExecFlagEnum

    Options: NONE, INITIAL, LINEAR, LINEAR_CONVERGENCE, NONLINEAR, NONLINEAR_CONVERGENCE, POSTCHECK, TIMESTEP_END, TIMESTEP_BEGIN, MULTIAPP_FIXED_POINT_END, MULTIAPP_FIXED_POINT_BEGIN, MULTIAPP_FIXED_POINT_CONVERGENCE, FINAL, CUSTOM, PRE_MULTIAPP_SETUP

    Controllable: No

    Description: The list of flag(s) indicating when this object should be executed. For a description of each flag, see https://mooseframework.inl.gov/source/interfaces/SetupInterface.html.

  • flag_sample

    C++ Type: ReporterName

    Controllable: No

    Description: Flag samples if the surrogate prediction was inadequate.

  • limit_get_global_samples

    Default: 429496729

    C++ Type: unsigned long

    Controllable: No

    Description: The maximum allowed number of items in the DenseMatrix returned by getGlobalSamples method.

  • limit_get_local_samples

    Default: 429496729

    C++ Type: unsigned long

    Controllable: No

    Description: The maximum allowed number of items in the DenseMatrix returned by getLocalSamples method.

  • limit_get_next_local_row

    Default: 429496729

    C++ Type: unsigned long

    Controllable: No

    Description: The maximum allowed number of items in the std::vector returned by getNextLocalRow method.

  • max_procs_per_row

    Default: 4294967295

    C++ Type: unsigned int

    Controllable: No

    Description: This will ensure that the sampler is partitioned properly when 'MultiApp/*/max_procs_per_app' is specified. It is not recommended to use otherwise.

  • min_procs_per_row

    Default: 1

    C++ Type: unsigned int

    Controllable: No

    Description: This will ensure that the sampler is partitioned properly when 'MultiApp/*/min_procs_per_app' is specified. It is not recommended to use otherwise.

  • num_random_seeds

    Default: 100000

    C++ Type: unsigned int

    Controllable: No

    Description: Initialize a certain number of random seeds. Change from the default only if you have to.

  • seed

    Default: 0

    C++ Type: unsigned int

    Controllable: No

    Description: Random number generator initial seed

  • use_absolute_value

    Default: False

    C++ Type: bool

    Controllable: No

    Description: Use absolute value of the sub app output

Optional Parameters

  • control_tags

    C++ Type: std::vector<std::string>

    Controllable: No

    Description: Adds user-defined labels for accessing object parameters via control logic.

  • enable

    Default: True

    C++ Type: bool

    Controllable: No

    Description: Set the enabled status of the MooseObject.

Advanced Parameters