LibtorchANNTrainer

Trains a simple neural network using libtorch.

Overview

This trainer is dedicated to training a LibtorchArtificialNeuralNet. For a detailed description of the neural network trained by this object, see LibtorchArtificialNeuralNet. The user can customize the neural network in the trainer; however, the optimization algorithm is hardcoded to be Adam.
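
To give a feel for what the Adam optimizer does with the learning rate, the following is a minimal sketch of the textbook Adam update applied to a one-dimensional quadratic. It is not the libtorch implementation used by this trainer. The function name `adam_minimize`, the hyperparameters `beta1`, `beta2`, and `eps`, and the test function are assumptions chosen for illustration.

```python
import math


def adam_minimize(grad, x0, learning_rate=0.001, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a scalar function with the textbook Adam update (illustrative only)."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        # Exponential moving averages of the gradient and the squared gradient.
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for the zero initialization of m and v.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Hypothetical test problem: minimize (x - 3)^2, whose gradient is 2 (x - 3).
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0,
                      learning_rate=0.1, steps=500)
print(x_min)
```

The step length is governed mostly by `learning_rate`, because the gradient is normalized by its running magnitude.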

Example Input File Syntax

Let us try to approximate a function, evaluated by the GFunction vector postprocessor shown below, over a three-dimensional domain. For this, we select points using a tensor product grid as follows:

[Samplers]
  [sample]
    type = CartesianProduct
    linear_space_items = '0 0.0125 5
                          0 0.0125 5
                          0 0.0125 5'
  []
[]
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train.i)
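
As a rough illustration of the grid this sampler block describes, the sketch below enumerates a Cartesian product in plain Python. It assumes that the third entry of each triplet is the number of values generated along that axis, starting at the minimum. That reading is an assumption, not something the parameter description states. The helper `linear_space` is hypothetical.

```python
from itertools import product


def linear_space(start, step, num):
    """Return num values start, start + step, ... (assumed reading of one triplet)."""
    return [start + i * step for i in range(num)]


# The three triplets given to linear_space_items in the [Samplers] block.
triplets = [(0.0, 0.0125, 5), (0.0, 0.0125, 5), (0.0, 0.0125, 5)]
axes = [linear_space(*t) for t in triplets]

# Tensor (Cartesian) product: one row per combination of axis values.
rows = list(product(*axes))

print(len(rows))
print(rows[0])
print(rows[-1])
```

Under this assumption each axis contributes five values, so the sampler would produce 5 × 5 × 5 rows with three columns each.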

Following this, the function is evaluated using a vector postprocessor:

[VectorPostprocessors]
  [values]
    type = GFunction
    sampler = sample
    q_vector = '0 0 0'
    execute_on = INITIAL
    outputs = none
  []
[]
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train.i)

Once this is done, the corresponding inputs (from the sampler) and outputs (from the postprocessor) are handed to the neural net trainer to optimize the weights:

[Trainers]
  [train]
    type = LibtorchANNTrainer
    sampler = sample
    response = values/g_values
    num_epochs = 40
    num_batches = 10
    num_neurons_per_layer = '64 32'
    learning_rate = 0.001
    nn_filename = mynet.pt
    read_from_file = false
    print_epoch_loss = 10
    activation_function = 'relu relu'
    max_processes = 1
    standardize_input = false
    standardize_output = false
  []
[]
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train.i)

Note that the user can set the architecture of the neural net using the "num_neurons_per_layer" and "activation_function" parameters. The optimization depends on several parameters: "num_batches" defines how many batches the training samples are split into, while "num_epochs" limits how many times we iterate over the batches.
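
The interplay of "num_epochs" and "num_batches" can be sketched as two nested loops. The example below is only a schematic: how the trainer actually partitions or shuffles the samples is not described here, so the contiguous split in `split_into_batches` is an assumption, as is the sample count of 125 (five values per axis in three dimensions).

```python
def split_into_batches(n_samples, num_batches):
    """Split indices 0..n_samples-1 into num_batches contiguous, near-equal batches."""
    base, extra = divmod(n_samples, num_batches)
    batches, start = [], 0
    for b in range(num_batches):
        # The first `extra` batches receive one additional sample.
        size = base + (1 if b < extra else 0)
        batches.append(list(range(start, start + size)))
        start += size
    return batches


def count_updates(n_samples, num_epochs, num_batches):
    """Count optimizer steps when every batch is visited once per epoch."""
    batches = split_into_batches(n_samples, num_batches)
    updates = 0
    for _epoch in range(num_epochs):
        for _batch in batches:
            # A real trainer would evaluate the loss on this batch
            # and take one optimizer step here.
            updates += 1
    return updates


batches = split_into_batches(125, 10)
print([len(b) for b in batches])
print(count_updates(125, num_epochs=40, num_batches=10))
```

With the values from the example input, each epoch visits 10 batches, so 40 epochs correspond to 400 weight updates in this schematic.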

The trained neural network can then be evaluated using LibtorchANNSurrogate.
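
The "standardize_input" and "standardize_output" switches in the trainer block are described as centering and scaling the training data. A minimal sketch of that idea follows, assuming the common convention of subtracting the mean and dividing by the population standard deviation. Whether the trainer uses the population or the sample standard deviation is not stated here, and the helper names are hypothetical.

```python
import math


def standardize(values):
    """Center and scale values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Assumes the values are not all identical, so std is nonzero.
    scaled = [(v - mean) / std for v in values]
    return scaled, mean, std


def unstandardize(scaled, mean, std):
    """Map standardized values back to the original scale."""
    return [s * std + mean for s in scaled]


x = [0.0, 0.0125, 0.025, 0.0375, 0.05]
x_scaled, x_mean, x_std = standardize(x)
print(x_scaled)
print(unstandardize(x_scaled, x_mean, x_std))
```

The mean and standard deviation have to be kept alongside the network, because predictions made in the standardized space must be mapped back with the same values.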

Input Parameters

  • response

    C++ Type:ReporterName

    Controllable:No

    Description:Reporter value of response results, can be vpp with <vpp_name>/<vector_name> or sampler column with 'sampler/col_<index>'.

  • sampler

    C++ Type:SamplerName

    Controllable:No

    Description:Sampler used to create predictor and response data.

Required Parameters

  • activation_function

    Default:relu

    C++ Type:std::vector<std::string>

    Controllable:No

    Description:The type of activation functions to use. It is either one value or one value per hidden layer.

  • converged_reporter

    C++ Type:ReporterName

    Controllable:No

    Description:Reporter value used to determine if a sample's multiapp solve converged.

  • cv_n_trials

    Default:1

    C++ Type:unsigned int

    Controllable:No

    Description:Number of repeated trials of cross-validation to perform.

  • cv_seed

    Default:4294967295

    C++ Type:unsigned int

    Controllable:No

    Description:Seed used to initialize random number generator for data splitting during cross validation.

  • cv_splits

    Default:10

    C++ Type:unsigned int

    Controllable:No

    Description:Number of splits (k) to use in k-fold cross-validation.

  • cv_surrogate

    C++ Type:UserObjectName

    Controllable:No

    Description:Name of Surrogate object used for model cross-validation.

  • cv_type

    Default:none

    C++ Type:MooseEnum

    Options:none, k_fold

    Controllable:No

    Description:Cross-validation method to use for dataset. Options are 'none' or 'k_fold'.

  • filename

    C++ Type:FileName

    Controllable:No

    Description:The name of the file which will be associated with the saved/loaded data.

  • learning_rate

    Default:0.001

    C++ Type:double

    Unit:(no unit assumed)

    Controllable:No

    Description:Learning rate (relaxation).

  • max_processes

    Default:1

    C++ Type:unsigned int

    Controllable:No

    Description:The maximum number of parallel processes that the trainer will use.

  • nn_filename

    Default:net.pt

    C++ Type:std::string

    Controllable:No

    Description:Filename used to output the neural net parameters.

  • num_batches

    Default:1

    C++ Type:unsigned int

    Controllable:No

    Description:Number of batches.

  • num_epochs

    Default:1

    C++ Type:unsigned int

    Controllable:No

    Description:Number of training epochs.

  • num_neurons_per_layer

    C++ Type:std::vector<unsigned int>

    Controllable:No

    Description:Number of neurons per layer.

  • predictor_cols

    C++ Type:std::vector<unsigned int>

    Controllable:No

    Description:Sampler columns used as the independent random variables. If 'predictors' and 'predictor_cols' are both empty, all sampler columns are used.

  • predictors

    C++ Type:std::vector<ReporterName>

    Controllable:No

    Description:Reporter values used as the independent random variables. If 'predictors' and 'predictor_cols' are both empty, all sampler columns are used.

  • print_epoch_loss

    Default:0

    C++ Type:unsigned int

    Controllable:No

    Description:Epoch training loss printing. 0 - no printing, 1 - every epoch, 10 - every 10th epoch.

  • read_from_file

    Default:False

    C++ Type:bool

    Controllable:No

    Description:Switch to allow reading old trained neural nets for further training.

  • rel_loss_tol

    Default:0

    C++ Type:double

    Unit:(no unit assumed)

    Controllable:No

    Description:The relative loss where we stop the training of the neural net.

  • seed

    Default:11

    C++ Type:unsigned int

    Controllable:No

    Description:Random number generator seed for stochastic optimizers.

  • skip_unconverged_samples

    Default:False

    C++ Type:bool

    Controllable:No

    Description:True to skip samples where the multiapp did not converge; 'stochastic_reporter' is required to do this.

  • standardize_input

    Default:True

    C++ Type:bool

    Controllable:No

    Description:Standardize (center and scale) training inputs (x values)

  • standardize_output

    Default:True

    C++ Type:bool

    Controllable:No

    Description:Standardize (center and scale) training outputs (y values)
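
The cross-validation parameters in this list (cv_type, cv_splits, cv_n_trials, cv_seed) refer to k-fold cross-validation. The sketch below shows the general k-fold idea only: the indices are shuffled with a seed, split into k folds, and each fold serves once as the held-out set. The shuffling scheme, the use of Python's `random` module, and the function name `k_fold_splits` are assumptions; the trainer's own partitioning is not documented here.

```python
import random


def k_fold_splits(n_samples, cv_splits, cv_seed):
    """Return (train, test) index lists for each of the cv_splits folds."""
    indices = list(range(n_samples))
    # Seeded shuffle so that the partition is reproducible.
    random.Random(cv_seed).shuffle(indices)
    # Deal the shuffled indices into cv_splits folds of near-equal size.
    folds = [indices[i::cv_splits] for i in range(cv_splits)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Sample count of 125 is an assumed example; the other values are the listed defaults.
splits = k_fold_splits(n_samples=125, cv_splits=10, cv_seed=4294967295)
print(len(splits))
print(len(splits[0][0]), len(splits[0][1]))
```

Repeating the procedure with different seeds would correspond to several trials in the sense of cv_n_trials, again under the assumptions stated above.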

Optional Parameters

  • allow_duplicate_execution_on_initial

    Default:False

    C++ Type:bool

    Controllable:No

    Description:In the case where this UserObject is depended upon by an initial condition, allow it to be executed twice during the initial setup (once before the IC and again after mesh adaptivity, if applicable).

  • execute_on

    Default:TIMESTEP_END

    C++ Type:ExecFlagEnum

    Options:XFEM_MARK, FORWARD, ADJOINT, HOMOGENEOUS_FORWARD, ADJOINT_TIMESTEP_BEGIN, ADJOINT_TIMESTEP_END, NONE, INITIAL, LINEAR, LINEAR_CONVERGENCE, NONLINEAR, NONLINEAR_CONVERGENCE, POSTCHECK, TIMESTEP_END, TIMESTEP_BEGIN, MULTIAPP_FIXED_POINT_END, MULTIAPP_FIXED_POINT_BEGIN, MULTIAPP_FIXED_POINT_CONVERGENCE, FINAL, CUSTOM

    Controllable:No

    Description:The list of flag(s) indicating when this object should be executed. For a description of each flag, see https://mooseframework.inl.gov/source/interfaces/SetupInterface.html.

  • execution_order_group

    Default:0

    C++ Type:int

    Controllable:No

    Description:Execution order groups are executed in increasing order (e.g., the lowest number is executed first). Note that negative group numbers may be used to execute groups before the default (0) group. Please refer to the user object documentation for ordering of user object execution within a group.

  • force_postaux

    Default:False

    C++ Type:bool

    Controllable:No

    Description:Forces the UserObject to be executed in POSTAUX

  • force_preaux

    Default:False

    C++ Type:bool

    Controllable:No

    Description:Forces the UserObject to be executed in PREAUX

  • force_preic

    Default:False

    C++ Type:bool

    Controllable:No

    Description:Forces the UserObject to be executed in PREIC during initial setup

Execution Scheduling Parameters

  • control_tags

    C++ Type:std::vector<std::string>

    Controllable:No

    Description:Adds user-defined labels for accessing object parameters via control logic.

  • enable

    Default:True

    C++ Type:bool

    Controllable:Yes

    Description:Set the enabled status of the MooseObject.

  • use_displaced_mesh

    Default:False

    C++ Type:bool

    Controllable:No

    Description:Whether or not this object should use the displaced mesh for computation. Note that in the case this is true but no displacements are provided in the Mesh block the undisplaced mesh will still be used.

Advanced Parameters

  • prop_getter_suffix

    C++ Type:MaterialPropertyName

    Unit:(no unit assumed)

    Controllable:No

    Description:An optional suffix parameter that can be appended to any attempt to retrieve/get material properties. The suffix will be prepended with a '_' character.

  • use_interpolated_state

    Default:False

    C++ Type:bool

    Controllable:No

    Description:For the old and older state use projected material properties interpolated at the quadrature points. To set up projection use the ProjectedStatefulMaterialStorageAction.

Material Property Retrieval Parameters
