LibtorchANNTrainer

Trains a simple neural network using libtorch.

Overview

This trainer trains a LibtorchArtificialNeuralNet. For a detailed description of the neural network trained by this object, visit LibtorchArtificialNeuralNet. The user can customize the neural network in the trainer; however, the optimization algorithm is hardcoded to be Adam.
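For readers unfamiliar with Adam, the following is a minimal scalar sketch of the textbook update rule, not the libtorch implementation the trainer calls. The helper names (`adam_step`, `minimize_quadratic`) and the values of `beta1`, `beta2` and `eps` are assumptions chosen for illustration; only the learning rate corresponds to a parameter of this trainer.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter and return (theta, m, v).

    t is the 1-based step counter used for bias correction.
    """
    m = beta1 * m + (1.0 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)               # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)               # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(steps=200, lr=0.1):
    """Toy problem (hypothetical): minimize f(x) = (x - 3)^2 starting from x = 0."""
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (x - 3.0)
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x
```

On the very first step the bias-corrected update has magnitude close to the learning rate regardless of the gradient scale, which is one reason the learning rate is the main tuning knob.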

Example Input File Syntax

Let us try to approximate the following function: $y = \prod_{i=1}^{3}|4x_i-2|$ over the $[0,0.05]^3$ domain. For this, we select $125\ (5 \times 5 \times 5)$ points using a tensor product grid as follows:

[Samplers]
  [sample]
    type = CartesianProduct
    linear_space_items = '0 0.0125 5
                          0 0.0125 5
                          0 0.0125 5'
  []
[]
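As a rough illustration of what this sampler block describes, the sketch below builds the same tensor product grid in plain Python. Each triplet is read as (min, step size, number of steps), following the parameter description. The function names are hypothetical, and the row ordering produced here is an assumption that may differ from the ordering the sampler uses.

```python
from itertools import product


def linear_space(start, step, count):
    """Return `count` equally spaced values beginning at `start`."""
    return [start + i * step for i in range(count)]


def cartesian_product(items):
    """Build every combination of the per-dimension grids.

    `items` is a list of (min, step, count) triplets, one per dimension.
    """
    axes = [linear_space(*item) for item in items]
    return [list(point) for point in product(*axes)]


# Three dimensions, each with 5 points from 0 to 0.05 in steps of 0.0125.
samples = cartesian_product([(0.0, 0.0125, 5)] * 3)
```

With five points per dimension, `samples` holds the 125 rows mentioned above, spanning the corners (0, 0, 0) and (0.05, 0.05, 0.05).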

Following this, the function is evaluated using a vector postprocessor:

[VectorPostprocessors]
  [values]
    type = GFunction
    sampler = sample
    q_vector = '0 0 0'
    execute_on = INITIAL
    outputs = none
  []
[]
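The sketch below evaluates the target function for a single sample. It assumes that GFunction follows the usual Sobol G-function form, $\prod_i (|4x_i-2| + q_i)/(1+q_i)$, which reduces to the function stated at the start of this example when every entry of `q_vector` is zero. The Python function name is hypothetical.

```python
def g_function(x, q):
    """Evaluate the G-function for one sample.

    x: input coordinates; q: one non-negative coefficient per coordinate.
    """
    value = 1.0
    for x_i, q_i in zip(x, q):
        value *= (abs(4.0 * x_i - 2.0) + q_i) / (1.0 + q_i)
    return value


# With q = 0 the function reduces to the product of |4 x_i - 2|.
y_origin = g_function([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
y_corner = g_function([0.05, 0.05, 0.05], [0.0, 0.0, 0.0])
```

Over the sampled domain the response therefore varies between 1.8 cubed at the far corner and 8 at the origin.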

Once this is done, the corresponding inputs (from the sampler) and outputs (from the postprocessor) are handed to the neural net trainer to optimize the weights:

[Trainers]
  [train]
    type = LibtorchANNTrainer
    sampler = sample
    response = values/g_values
    num_epochs = 40
    num_batches = 10
    num_neurons_per_layer = '64 32'
    learning_rate = 0.001
    nn_filename = mynet.pt
    read_from_file = false
    print_epoch_loss = 10
    activation_function = 'relu relu'
    max_processes = 1
    standardize_input = false
    standardize_output = false
  []
[]
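To get a feel for the size of the network requested by `num_neurons_per_layer = '64 32'`, the sketch below counts trainable parameters. It assumes a fully connected feed-forward network with a bias on every layer, three inputs (one per sampler column) and a single output; these assumptions and the function name are illustrative and are not taken from the trainer source.

```python
def count_parameters(num_inputs, num_neurons_per_layer, num_outputs=1):
    """Count weights and biases of a dense feed-forward network."""
    sizes = [num_inputs] + list(num_neurons_per_layer) + [num_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out  # weight matrix
        total += fan_out           # bias vector
    return total


n_params = count_parameters(3, [64, 32])
```

Under these assumptions the network has 2369 parameters, considerably more than the 125 training samples, which is worth keeping in mind when judging the quality of the fit.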

The user can set the architecture of the neural net using the "num_neurons_per_layer" and "activation_function" parameters. The optimization algorithm depends on several parameters: "num_batches" defines how many batches the training samples are split into, while "num_epochs" limits how many times we iterate over the batches.
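The interplay of batches and epochs can be sketched as follows. The partitioning into near-equal contiguous chunks is an assumption made for illustration; the trainer may distribute the samples differently, and training may also stop early if a loss tolerance is reached. Both function names are hypothetical.

```python
def split_into_batches(num_samples, num_batches):
    """Split sample indices into `num_batches` near-equal contiguous chunks."""
    base, extra = divmod(num_samples, num_batches)
    batches, start = [], 0
    for b in range(num_batches):
        size = base + (1 if b < extra else 0)  # spread the remainder
        batches.append(range(start, start + size))
        start += size
    return batches


def max_optimizer_steps(num_samples, num_batches, num_epochs):
    """Upper bound on parameter updates: one per batch per epoch."""
    return num_epochs * len(split_into_batches(num_samples, num_batches))


batches = split_into_batches(125, 10)
steps = max_optimizer_steps(125, 10, 40)
```

For the example above (125 samples, 10 batches, 40 epochs) this gives batches of 12 or 13 samples and at most 400 optimizer updates.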

The trained neural network can then be evaluated using LibtorchANNSurrogate.

Input Parameters

Required Parameters

response
C++ Type: ReporterName
Controllable: No
Description: Reporter value of response results, can be vpp with <vpp_name>/<vector_name> or sampler column with 'sampler/col_<index>'.

sampler
C++ Type: SamplerName
Controllable: No
Description: Sampler used to create predictor and response data.

Optional Parameters

activation_function
Default: relu
C++ Type: std::vector<std::string>
Controllable: No
Description: The type of activation functions to use. It is either one value or one value per hidden layer.

converged_reporter
C++ Type: ReporterName
Controllable: No
Description: Reporter value used to determine if a sample's multiapp solve converged.

cv_n_trials
Default: 1
C++ Type: unsigned int
Controllable: No
Description: Number of repeated trials of cross-validation to perform.

cv_seed
Default: 4294967295
C++ Type: unsigned int
Controllable: No
Description: Seed used to initialize random number generator for data splitting during cross validation.

cv_splits
Default: 10
C++ Type: unsigned int
Controllable: No
Description: Number of splits (k) to use in k-fold cross-validation.

cv_surrogate
C++ Type: UserObjectName
Controllable: No
Description: Name of Surrogate object used for model cross-validation.

cv_type
Default: none
C++ Type: MooseEnum
Options: none, k_fold
Controllable: No
Description: Cross-validation method to use for dataset. Options are 'none' or 'k_fold'.

filename
C++ Type: FileName
Controllable: No
Description: The name of the file which will be associated with the saved/loaded data.

learning_rate
Default: 0.001
C++ Type: double
Unit: (no unit assumed)
Controllable: No
Description: Learning rate (relaxation).

max_processes
Default: 1
C++ Type: unsigned int
Controllable: No
Description: The maximum number of parallel processes that the trainer will use.

nn_filename
Default: net.pt
C++ Type: std::string
Controllable: No
Description: Filename used to output the neural net parameters.

num_batches
Default: 1
C++ Type: unsigned int
Controllable: No
Description: Number of batches.

num_epochs
Default: 1
C++ Type: unsigned int
Controllable: No
Description: Number of training epochs.

num_neurons_per_layer
C++ Type: std::vector<unsigned int>
Controllable: No
Description: Number of neurons per layer.

predictor_cols
C++ Type: std::vector<unsigned int>
Controllable: No
Description: Sampler columns used as the independent random variables. If 'predictors' and 'predictor_cols' are both empty, all sampler columns are used.

predictors
C++ Type: std::vector<ReporterName>
Controllable: No
Description: Reporter values used as the independent random variables. If 'predictors' and 'predictor_cols' are both empty, all sampler columns are used.

print_epoch_loss
Default: 0
C++ Type: unsigned int
Controllable: No
Description: Epoch training loss printing. 0 - no printing, 1 - every epoch, 10 - every 10th epoch.

read_from_file
Default: False
C++ Type: bool
Controllable: No
Description: Switch to allow reading old trained neural nets for further training.

rel_loss_tol
Default: 0
C++ Type: double
Unit: (no unit assumed)
Controllable: No
Description: The relative loss where we stop the training of the neural net.

seed
Default: 11
C++ Type: unsigned int
Controllable: No
Description: Random number generator seed for stochastic optimizers.

skip_unconverged_samples
Default: False
C++ Type: bool
Controllable: No
Description: True to skip samples where the multiapp did not converge, 'stochastic_reporter' is required to do this.

standardize_input
Default: True
C++ Type: bool
Controllable: No
Description: Standardize (center and scale) training inputs (x values)

standardize_output
Default: True
C++ Type: bool
Controllable: No
Description: Standardize (center and scale) training outputs (y values)

Execution Scheduling Parameters

allow_duplicate_execution_on_initial
Default: False
C++ Type: bool
Controllable: No
Description: In the case where this UserObject is depended upon by an initial condition, allow it to be executed twice during the initial setup (once before the IC and again after mesh adaptivity, if applicable).

execute_on
Default: TIMESTEP_END
C++ Type: ExecFlagEnum
Options: XFEM_MARK, FORWARD, ADJOINT, HOMOGENEOUS_FORWARD, ADJOINT_TIMESTEP_BEGIN, ADJOINT_TIMESTEP_END, NONE, INITIAL, LINEAR, LINEAR_CONVERGENCE, NONLINEAR, NONLINEAR_CONVERGENCE, POSTCHECK, TIMESTEP_END, TIMESTEP_BEGIN, MULTIAPP_FIXED_POINT_END, MULTIAPP_FIXED_POINT_BEGIN, MULTIAPP_FIXED_POINT_CONVERGENCE, FINAL, CUSTOM
Controllable: No
Description: The list of flag(s) indicating when this object should be executed. For a description of each flag, see https://mooseframework.inl.gov/source/interfaces/SetupInterface.html.

execution_order_group
Default: 0
C++ Type: int
Controllable: No
Description: Execution order groups are executed in increasing order (e.g., the lowest number is executed first). Note that negative group numbers may be used to execute groups before the default (0) group. Please refer to the user object documentation for ordering of user object execution within a group.

force_postaux
Default: False
C++ Type: bool
Controllable: No
Description: Forces the UserObject to be executed in POSTAUX

force_preaux
Default: False
C++ Type: bool
Controllable: No
Description: Forces the UserObject to be executed in PREAUX

force_preic
Default: False
C++ Type: bool
Controllable: No
Description: Forces the UserObject to be executed in PREIC during initial setup

Advanced Parameters

control_tags
C++ Type: std::vector<std::string>
Controllable: No
Description: Adds user-defined labels for accessing object parameters via control logic.

enable
Default: True
C++ Type: bool
Controllable: Yes
Description: Set the enabled status of the MooseObject.

use_displaced_mesh
Default: False
C++ Type: bool
Controllable: No
Description: Whether or not this object should use the displaced mesh for computation. Note that in the case this is true but no displacements are provided in the Mesh block the undisplaced mesh will still be used.

Material Property Retrieval Parameters

prop_getter_suffix
C++ Type: MaterialPropertyName
Unit: (no unit assumed)
Controllable: No
Description: An optional suffix parameter that can be appended to any attempt to retrieve/get material properties. The suffix will be prepended with a '_' character.

use_interpolated_state
Default: False
C++ Type: bool
Controllable: No
Description: For the old and older state use projected material properties interpolated at the quadrature points. To set up projection use the ProjectedStatefulMaterialStorageAction.

Input Files

(modules/stochastic_tools/examples/surrogates/cross_validation/all_trainers_uniform_cv.i)
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train_and_evaluate.i)
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train.i)
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/retrain.i)

(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train.i)

[StochasticTools]
[]

[Samplers]
  [sample]
    type = CartesianProduct
    linear_space_items = '0 0.0125 5
                          0 0.0125 5
                          0 0.0125 5'
  []
[]

[VectorPostprocessors]
  [values]
    type = GFunction
    sampler = sample
    q_vector = '0 0 0'
    execute_on = INITIAL
    outputs = none
  []
[]

[Trainers]
  [train]
    type = LibtorchANNTrainer
    sampler = sample
    response = values/g_values
    num_epochs = 40
    num_batches = 10
    num_neurons_per_layer = '64 32'
    learning_rate = 0.001
    nn_filename = mynet.pt
    read_from_file = false
    print_epoch_loss = 10
    activation_function = 'relu relu'
    max_processes = 1
    standardize_input = false
    standardize_output = false
  []
[]

[Outputs]
  [out]
    type = SurrogateTrainerOutput
    trainers = 'train'
    execute_on = FINAL
  []
[]

(modules/stochastic_tools/examples/surrogates/cross_validation/all_trainers_uniform_cv.i)

[StochasticTools]
[]

[GlobalParams]
  sampler = cv_sampler
  response = results/response_data:max:value
  cv_type = "k_fold"
  cv_splits = 5
  cv_n_trials = 100
[]

[Distributions]
  [k_dist]
    type = Uniform
    lower_bound = 1
    upper_bound = 10
  []
  [q_dist]
    type = Uniform
    lower_bound = 9000
    upper_bound = 11000
  []
  [L_dist]
    type = Uniform
    lower_bound = 0.01
    upper_bound = 0.05
  []
  [Tinf_dist]
    type = Uniform
    lower_bound = 290
    upper_bound = 310
  []
[]

[Samplers]
  [cv_sampler]
    type = LatinHypercube
    distributions = 'k_dist q_dist L_dist Tinf_dist'
    num_rows = 1000
    execute_on = PRE_MULTIAPP_SETUP
  []
[]

[MultiApps]
  [cv_sub]
    type = SamplerFullSolveMultiApp
    input_files = all_sub.i
    mode = batch-reset
  []
[]

[Controls]
  [pr_cmdline]
    type = MultiAppSamplerControl
    multi_app = cv_sub
    param_names = 'Materials/conductivity/prop_values Kernels/source/value Mesh/xmax BCs/right/value'
  []
[]

[Transfers]
  [response_data]
    type = SamplerReporterTransfer
    from_multi_app = cv_sub
    stochastic_reporter = results
    from_reporter = 'max/value'
  []
[]

[Reporters]
  [results]
    type = StochasticReporter
    outputs = none
  []
  [cv_scores]
    type = CrossValidationScores
    models = 'pr_surr pc_surr np_surr gp_surr ann_surr'
    execute_on = FINAL
  []
[]

[Trainers]
  [pr_max]
    type = PolynomialRegressionTrainer
    regression_type = "ols"
    max_degree = 3
    cv_surrogate = "pr_surr"
    execute_on = timestep_end
  []
  [pc_max]
    type = PolynomialChaosTrainer
    order = 3
    distributions = "k_dist q_dist L_dist Tinf_dist"
    cv_surrogate = "pc_surr"
    execute_on = timestep_end
  []
  [np_max]
    type = NearestPointTrainer
    cv_surrogate = "np_surr"
    execute_on = timestep_end
  []
  [gp_max]
    type = GaussianProcessTrainer
    covariance_function = 'rbf'
    standardize_params = 'true'
    standardize_data = 'true'
    cv_surrogate = "gp_surr"
    execute_on = timestep_end
  []
  [ann_max]
    type = LibtorchANNTrainer
    num_epochs = 100
    num_batches = 5
    num_neurons_per_layer = '64'
    learning_rate = 1e-2
    rel_loss_tol = 1e-4
    filename = mynet.pt
    read_from_file = false
    print_epoch_loss = 0
    activation_function = 'relu'
    cv_surrogate = "ann_surr"
    standardize_input = false
    standardize_output = false
  []
[]

[Covariance]
  [rbf]
    type = SquaredExponentialCovariance
    noise_variance = 3.79e-6
    signal_variance = 1 #Use a signal variance of 1 in the kernel
    length_factor = '5.34471 1.41191 5.90721 2.83723' #Select a length factor for each parameter
  []
[]

[Surrogates]
  [pr_surr]
    type = PolynomialRegressionSurrogate
    trainer = pr_max
  []
  [pc_surr]
    type = PolynomialChaos
    trainer = pc_max
  []
  [np_surr]
    type = NearestPointSurrogate
    trainer = np_max
  []
  [gp_surr]
    type = GaussianProcessSurrogate
    trainer = gp_max
  []
  [ann_surr]
    type = LibtorchANNSurrogate
    trainer = ann_max
  []
[]

[Outputs]
  [out]
    type = JSON
    execute_on = FINAL
  []
[]
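The cross-validation settings in the listing above (`cv_type = "k_fold"`, `cv_splits = 5`, 1000 sampler rows) can be pictured with the generic k-fold sketch below. The shuffling and fold assignment shown here are assumptions for illustration only and are not taken from the stochastic tools module; the function name and the seed value are hypothetical.

```python
import random


def k_fold_splits(num_rows, k, seed):
    """Return k (train_indices, test_indices) pairs covering all rows once."""
    indices = list(range(num_rows))
    random.Random(seed).shuffle(indices)         # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]    # k near-equal folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_splits(num_rows=1000, k=5, seed=0)
```

Each of the five splits trains on 800 rows and scores on the remaining 200. Repeating the procedure with different shuffles corresponds to the idea behind `cv_n_trials`.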

(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train_and_evaluate.i)

[StochasticTools]
[]

[Samplers]
  [sample]
    type = CartesianProduct
    linear_space_items = '0 0.0125 5
                          0 0.0125 5
                          0 0.0125 5'
  []
  [test]
    type = CartesianProduct
    linear_space_items = '0 0.05 2
                          0 0.05 2
                          0 0.05 2'
  []
[]

[VectorPostprocessors]
  [values]
    type = GFunction
    sampler = sample
    q_vector = '0 0 0'
    execute_on = INITIAL
    outputs = none
  []
[]

[Trainers]
  [train]
    type = LibtorchANNTrainer
    sampler = sample
    response = values/g_values
    num_epochs = 40
    num_batches = 10
    num_neurons_per_layer = '64 32'
    learning_rate = 0.001
    nn_filename = mynet_tne.pt
    read_from_file = false
    print_epoch_loss = 10
    max_processes = 1
    standardize_input = false
    standardize_output = false
  []
[]

[Surrogates]
  [surr]
    type = LibtorchANNSurrogate
    trainer = train
  []
[]

[Reporters]
  [results]
    type = EvaluateSurrogate
    model = surr
    sampler = test
    execute_on = FINAL
    parallel_type = ROOT
  []
[]

[Outputs]
  csv = true
  execute_on = FINAL
[]

(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/retrain.i)

[StochasticTools]
[]

[Samplers]
  [train_sample]
    type = CartesianProduct
    linear_space_items = '0 0.0125 5
                          0 0.0125 5
                          0 0.0125 5'
  []
  [test_sample]
    type = CartesianProduct
    linear_space_items = '0 0.05 2
                          0 0.05 2
                          0 0.05 2'
  []
[]

[VectorPostprocessors]
  [values]
    type = GFunction
    sampler = train_sample
    q_vector = '0 0 0'
    execute_on = INITIAL
    outputs = none
  []
[]

[Trainers]
  [train]
    type = LibtorchANNTrainer
    sampler = train_sample
    response = values/g_values
    num_epochs = 40
    num_batches = 10
    num_neurons_per_layer = '64 32'
    learning_rate = 0.001
    nn_filename = mynet.pt
    read_from_file = false
    print_epoch_loss = 10
    standardize_input = false
    standardize_output = false
  []
[]

[Surrogates]
  [surr]
    type = LibtorchANNSurrogate
    trainer = train
  []
[]

[Reporters]
  [results]
    type = EvaluateSurrogate
    model = surr
    sampler = test_sample
    execute_on = FINAL
    parallel_type = ROOT
  []
[]

[Outputs]
  csv = true
  execute_on = FINAL
[]
