LibtorchANNTrainer

Trains a simple neural network using libtorch.

Overview

This trainer is dedicated to training a LibtorchArtificialNeuralNet. For a detailed description of the neural network trained by this object, see LibtorchArtificialNeuralNet. The user can customize the neural network architecture through the trainer's parameters; however, the optimization algorithm is hardcoded to Adam.

Example Input File Syntax

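Let us approximate a function of three inputs over a box-shaped domain; the function itself is evaluated by the GFunction vector postprocessor shown further below. For this, we select sample points using a tensor-product grid as follows: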

[Samplers]
  [sample]
    type = CartesianProduct
    linear_space_items = '0 0.0125 5
                          0 0.0125 5
                          0 0.0125 5'
  []
[]
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train.i)
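To make the sampling step concrete, the following Python sketch builds the same kind of tensor-product grid outside of MOOSE. It assumes that each row of "linear_space_items" is a (start, step, count) triplet. The helper names linear_space and cartesian_product are hypothetical, and the row ordering is simply that of itertools.product, which may differ from the ordering used by the sampler.

```python
from itertools import product


def linear_space(start, step, count):
    """Return `count` equally spaced values beginning at `start`."""
    return [start + i * step for i in range(count)]


def cartesian_product(linear_space_items):
    """Build every combination of the per-dimension grids.

    `linear_space_items` is a flat list of (start, step, count) triplets,
    one triplet per input dimension (an assumption about the parameter).
    """
    if len(linear_space_items) % 3 != 0:
        raise ValueError("expected (start, step, count) triplets")
    axes = [
        linear_space(*linear_space_items[i:i + 3])
        for i in range(0, len(linear_space_items), 3)
    ]
    return [list(point) for point in product(*axes)]


# Same numbers as in the [Samplers] block: three dimensions, five points each.
rows = cartesian_product([0, 0.0125, 5,
                          0, 0.0125, 5,
                          0, 0.0125, 5])
print(len(rows), "samples with", len(rows[0]), "columns each")
```

Under this reading, the grid holds 5 x 5 x 5 = 125 training samples, each with three input values.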

Following this, the function is evaluated using a vector postprocessor:

[VectorPostprocessors]
  [values]
    type = GFunction
    sampler = sample
    q_vector = '0 0 0'
    execute_on = INITIAL
    outputs = none
  []
[]
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train.i)
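As a rough illustration of what the postprocessor produces for each sampler row, here is a minimal Python sketch. It assumes that GFunction evaluates the standard Sobol G-function, i.e. the product over all inputs of (|4 x_i - 2| + q_i) / (1 + q_i); this formula and the function name g_function are assumptions made for illustration and are not taken from this page.

```python
def g_function(x, q):
    """Sobol G-function (assumed form): prod_i (|4*x_i - 2| + q_i) / (1 + q_i)."""
    if len(x) != len(q):
        raise ValueError("x and q must have the same length")
    value = 1.0
    for x_i, q_i in zip(x, q):
        value *= (abs(4.0 * x_i - 2.0) + q_i) / (1.0 + q_i)
    return value


# q_vector = '0 0 0' as in the [VectorPostprocessors] block.
q_vector = [0.0, 0.0, 0.0]
for sample in ([0.0, 0.0, 0.0], [0.05, 0.05, 0.05]):
    print(sample, g_function(sample, q_vector))
```

One such value per sampler row forms the response vector that the trainer later reads as values/g_values.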

Once this is done, the corresponding inputs (from the sampler) and outputs (from the postprocessor) are handed to the neural net trainer to optimize the weights:

[Trainers]
  [train]
    type = LibtorchANNTrainer
    sampler = sample
    response = values/g_values
    num_epochs = 40
    num_batches = 10
    num_neurons_per_layer = '64 32'
    learning_rate = 0.001
    filename = mynet.pt
    read_from_file = false
    print_epoch_loss = 10
    activation_function = 'relu relu'
  []
[]
(modules/stochastic_tools/test/tests/surrogates/libtorch_nn/train.i)
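The "num_neurons_per_layer" and "activation_function" settings in the block above can be read as follows. This Python sketch only does bookkeeping: it assumes a fully connected network with one input per sampler column (three here), a single output, and a bias on every layer. The function names are hypothetical, and the real layer construction is done by LibtorchArtificialNeuralNet.

```python
def layer_shapes(num_inputs, num_neurons_per_layer, num_outputs):
    """Return (fan_in, fan_out) for each dense layer, hidden layers first."""
    sizes = [num_inputs, *num_neurons_per_layer, num_outputs]
    return list(zip(sizes[:-1], sizes[1:]))


def expand_activations(activation_function, num_hidden_layers):
    """Accept either one activation name or one name per hidden layer."""
    if len(activation_function) == 1:
        return list(activation_function) * num_hidden_layers
    if len(activation_function) != num_hidden_layers:
        raise ValueError("give one activation, or one per hidden layer")
    return list(activation_function)


def count_parameters(shapes):
    """Weights plus biases of all dense layers (assumes every layer has a bias)."""
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in shapes)


hidden = [64, 32]                       # num_neurons_per_layer = '64 32'
shapes = layer_shapes(3, hidden, 1)     # 3 inputs and 1 output are assumptions
activations = expand_activations(["relu", "relu"], len(hidden))
print(shapes, activations, count_parameters(shapes))
```

Under these assumptions the optimizer adjusts a few thousand weights and biases, which is small enough to train quickly on the 125-point grid.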

The user can set the architecture of the neural net using the "num_neurons_per_layer" and "activation_function" parameters. The optimization is controlled by several parameters: "num_batches" defines how many batches the training samples are split into, while "num_epochs" limits how many times we iterate over the batches.
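The interplay of "num_epochs", "num_batches", "learning_rate" and "print_epoch_loss" can be sketched with a toy problem. The Python code below is not the trainer's implementation: it fits a straight line instead of a neural net, writes out the textbook Adam update by hand, and assumes that the batch size is the sample count divided by "num_batches", rounded up. All function names are hypothetical.

```python
import math


def batch_slices(num_samples, num_batches):
    """Split sample indices into contiguous batches (assumed rounding: ceil)."""
    size = math.ceil(num_samples / num_batches)
    return [range(s, min(s + size, num_samples))
            for s in range(0, num_samples, size)]


def train(xs, ys, num_epochs, num_batches, learning_rate=0.001,
          print_epoch_loss=0):
    """Fit y = w*x + b with Adam; return (w, b, per-epoch loss, update count)."""
    params = [0.0, 0.0]                 # [w, b]
    m = [0.0, 0.0]                      # Adam first-moment estimates
    v = [0.0, 0.0]                      # Adam second-moment estimates
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    steps = 0
    history = []
    for epoch in range(1, num_epochs + 1):
        squared_error = 0.0
        for batch in batch_slices(len(xs), num_batches):
            grads = [0.0, 0.0]
            for i in batch:
                err = params[0] * xs[i] + params[1] - ys[i]
                squared_error += err * err
                grads[0] += 2.0 * err * xs[i] / len(batch)
                grads[1] += 2.0 * err / len(batch)
            steps += 1                  # one optimizer update per batch
            for k in range(2):
                m[k] = beta1 * m[k] + (1.0 - beta1) * grads[k]
                v[k] = beta2 * v[k] + (1.0 - beta2) * grads[k] ** 2
                m_hat = m[k] / (1.0 - beta1 ** steps)
                v_hat = v[k] / (1.0 - beta2 ** steps)
                params[k] -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
        # Mean squared error accumulated while sweeping the batches.
        history.append(squared_error / len(xs))
        if print_epoch_loss and epoch % print_epoch_loss == 0:
            print(f"epoch {epoch}: loss {history[-1]:.6f}")
    return params[0], params[1], history, steps


xs = [i / 124 for i in range(125)]      # 125 samples, like the grid above
ys = [2.0 * x + 1.0 for x in xs]        # made-up target for illustration
w, b, history, steps = train(xs, ys, num_epochs=40, num_batches=10,
                             print_epoch_loss=10)
```

With 40 epochs and 10 batches, the parameters are updated 400 times in total, and the loss is reported after every 10th epoch.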

The trained neural network can then be evaluated using LibtorchANNSurrogate.

Input Parameters

Required Parameters

  • response

    C++ Type: ReporterName

    Controllable: No

    Description: Reporter value of response results, can be vpp with <vpp_name>/<vector_name> or sampler column with 'sampler/col_<index>'.

  • sampler

    C++ Type: SamplerName

    Controllable: No

    Description: Sampler used to create predictor and response data.

Optional Parameters

  • activation_function

    Default: relu

    C++ Type: std::vector<std::string>

    Controllable: No

    Description: The type of activation functions to use. It is either one value or one value per hidden layer.

  • converged_reporter

    C++ Type: ReporterName

    Controllable: No

    Description: Reporter value used to determine if a sample's multiapp solve converged.

  • execute_on

    Default: TIMESTEP_END

    C++ Type: ExecFlagEnum

    Options: NONE, INITIAL, LINEAR, NONLINEAR, TIMESTEP_END, TIMESTEP_BEGIN, FINAL, CUSTOM, ALWAYS

    Controllable: No

    Description: The list of flag(s) indicating when this object should be executed, the available options include NONE, INITIAL, LINEAR, NONLINEAR, TIMESTEP_END, TIMESTEP_BEGIN, FINAL, CUSTOM, ALWAYS.

  • filename

    Default: net.pt

    C++ Type: std::string

    Controllable: No

    Description: Filename used to output the neural net parameters.

  • learning_rate

    Default: 0.001

    C++ Type: double

    Controllable: No

    Description: Learning rate (relaxation).

  • num_batches

    Default: 1

    C++ Type: unsigned int

    Controllable: No

    Description: Number of batches.

  • num_epochs

    Default: 1

    C++ Type: unsigned int

    Controllable: No

    Description: Number of training epochs.

  • num_neurons_per_layer

    C++ Type: std::vector<unsigned int>

    Controllable: No

    Description: Number of neurons per layer.

  • print_epoch_loss

    Default: 0

    C++ Type: unsigned int

    Controllable: No

    Description: Epoch training loss printing. 0 - no printing, 1 - every epoch, 10 - every 10th epoch.

  • prop_getter_suffix

    C++ Type: MaterialPropertyName

    Controllable: No

    Description: An optional suffix parameter that can be appended to any attempt to retrieve/get material properties. The suffix will be prepended with a '_' character.

  • read_from_file

    Default: False

    C++ Type: bool

    Controllable: No

    Description: Switch to allow reading old trained neural nets for further training.

  • rel_loss_tol

    Default: 0

    C++ Type: double

    Controllable: No

    Description: The relative loss where we stop the training of the neural net.

  • response_type

    Default: real

    C++ Type: MooseEnum

    Options: real, vector_real

    Controllable: No

    Description: Response data type.

  • seed

    Default: 11

    C++ Type: unsigned int

    Controllable: No

    Description: Random number generator seed for stochastic optimizers.

  • skip_unconverged_samples

    Default: False

    C++ Type: bool

    Controllable: No

    Description: True to skip samples where the multiapp did not converge, 'stochastic_reporter' is required to do this.

Advanced Parameters

  • allow_duplicate_execution_on_initial

    Default: False

    C++ Type: bool

    Controllable: No

    Description: In the case where this UserObject is depended upon by an initial condition, allow it to be executed twice during the initial setup (once before the IC and again after mesh adaptivity, if applicable).

  • control_tags

    C++ Type: std::vector<std::string>

    Controllable: No

    Description: Adds user-defined labels for accessing object parameters via control logic.

  • enable

    Default: True

    C++ Type: bool

    Controllable: Yes

    Description: Set the enabled status of the MooseObject.

  • force_postaux

    Default: False

    C++ Type: bool

    Controllable: No

    Description: Forces the UserObject to be executed in POSTAUX.

  • force_preaux

    Default: False

    C++ Type: bool

    Controllable: No

    Description: Forces the UserObject to be executed in PREAUX.

  • force_preic

    Default: False

    C++ Type: bool

    Controllable: No

    Description: Forces the UserObject to be executed in PREIC during initial setup.

  • use_displaced_mesh

    Default: False

    C++ Type: bool

    Controllable: No

    Description: Whether or not this object should use the displaced mesh for computation. Note that in the case this is true but no displacements are provided in the Mesh block the undisplaced mesh will still be used.
