- action_standard_deviations
C++ Type:std::vector<double>
Unit:(no unit assumed)
Controllable:No
Description:Standard deviation value used while sampling the actions.
- control
C++ Type:std::vector<ReporterName>
Controllable:No
Description:Reporters containing the values of the controlled quantities (control signals) from the model simulations.
- control_learning_rate
C++ Type:double
Unit:(no unit assumed)
Controllable:No
Description:Learning rate (relaxation) for the control neural net training.
- critic_learning_rate
C++ Type:double
Unit:(no unit assumed)
Controllable:No
Description:Learning rate (relaxation) for training the critic (emulator) neural net.
- log_probability
C++ Type:std::vector<ReporterName>
Controllable:No
Description:Reporters containing the log probabilities of the actions taken during the simulations.
- num_control_neurons_per_layer
C++ Type:std::vector<unsigned int>
Controllable:No
Description:Number of neurons per layer for the control neural network.
- num_critic_neurons_per_layer
C++ Type:std::vector<unsigned int>
Controllable:No
Description:Number of neurons per layer in the critic (emulator) neural net.
- num_epochs
C++ Type:unsigned int
Controllable:No
Description:Number of epochs for the training.
- response
C++ Type:std::vector<ReporterName>
Controllable:No
Description:Reporters containing the response values from the model.
- reward
C++ Type:ReporterName
Controllable:No
Description:Reporter containing the earned time-dependent rewards from the simulation.
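The action_standard_deviations and log_probability parameters above refer to sampling actions and recording their log probabilities. The following is a minimal sketch of that idea in Python, assuming each action is drawn independently from a normal distribution centered on the controller output. The function names, the example mean, and the example standard deviation are hypothetical and are not taken from the trainer's implementation.

```python
import math
import random


def sample_action(mean, std, rng):
    """Draw one action from a normal distribution N(mean, std^2)."""
    return rng.gauss(mean, std)


def gaussian_log_prob(action, mean, std):
    """Log density of N(mean, std^2) evaluated at `action`."""
    z = (action - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


# Assumption: 11 simply mirrors the documented default of `seed`.
rng = random.Random(11)

# Hypothetical controller output (mean) and action standard deviation.
mean, std = 0.3, 0.1

action = sample_action(mean, std, rng)
log_prob = gaussian_log_prob(action, mean, std)
```

In this sketch a smaller standard deviation concentrates the sampled actions around the controller output, which reduces exploration during training.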
LibtorchDRLControlTrainer
Trains a neural network controller using the Proximal Policy Optimization (PPO) algorithm.
Overview
This object trains a Deep Reinforcement Learning (DRL) controller using the Proximal Policy Optimization (PPO) algorithm (Schulman et al., 2017).
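For orientation, the core of PPO as described by Schulman et al. (2017) is a clipped surrogate objective. The sketch below evaluates it for a single sample in Python. It illustrates the published formula rather than this object's actual implementation. The function name is hypothetical, and the default of 0.2 mirrors the documented default of the clip_parameter input.

```python
import math


def clipped_surrogate(new_log_prob, old_log_prob, advantage, clip_parameter=0.2):
    """PPO clipped surrogate objective for one sample (a quantity to maximize)."""
    # Probability ratio between the updated policy and the policy that
    # generated the data, computed from the stored log probabilities.
    ratio = math.exp(new_log_prob - old_log_prob)
    # Keep the ratio within [1 - clip_parameter, 1 + clip_parameter].
    clipped_ratio = max(1.0 - clip_parameter, min(ratio, 1.0 + clip_parameter))
    # Taking the minimum gives a pessimistic bound on the policy improvement.
    return min(ratio * advantage, clipped_ratio * advantage)
```

With a positive advantage, a ratio above 1 + clip_parameter earns no additional objective value, so a single update cannot move the policy far from the one that collected the data.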
Example Input File Syntax
Input Parameters
- clip_parameter
Default:0.2
C++ Type:double
Unit:(no unit assumed)
Controllable:No
Description:Clip parameter used while clamping the advantage value.
- control_activation_functions
Default:relu
C++ Type:std::vector<std::string>
Controllable:No
Description:The type of activation functions to use in the control neural net. It is either one value or one value per hidden layer.
- critic_activation_functions
Default:relu
C++ Type:std::vector<std::string>
Controllable:No
Description:The type of activation functions to use in the critic (emulator) neural net. It is either one value or one value per hidden layer.
- decay_factor
Default:1
C++ Type:double
Unit:(no unit assumed)
Controllable:No
Description:Decay factor for calculating the return. This accounts for decreased reward values from the later steps.
- filename
C++ Type:FileName
Controllable:No
Description:The name of the file which will be associated with the saved/loaded data.
- filename_base
C++ Type:std::string
Controllable:No
Description:Filename used to output the neural net parameters.
- input_timesteps
Default:1
C++ Type:unsigned int
Controllable:No
Description:Number of time steps to use in the input data. If larger than 1, data from the previous time steps will also be used as inputs in the training.
- loss_print_frequency
Default:0
C++ Type:unsigned int
Controllable:No
Description:The frequency which is used to print the loss values. If 0, the loss values are not printed.
- read_from_file
Default:False
C++ Type:bool
Controllable:No
Description:Switch to read the neural network parameters from a file.
- response_scaling_factors
C++ Type:std::vector<double>
Unit:(no unit assumed)
Controllable:No
Description:A normalization constant which will be used to divide the response values. This is used for the manipulation of the neural net inputs for better training efficiency.
- response_shift_factors
C++ Type:std::vector<double>
Unit:(no unit assumed)
Controllable:No
Description:A shift constant which will be used to shift the response values. This is used for the manipulation of the neural net inputs for better training efficiency.
- seed
Default:11
C++ Type:unsigned int
Controllable:No
Description:Random number generator seed for stochastic optimizers.
- shift_outputs
Default:True
C++ Type:bool
Controllable:No
Description:Whether to shift the outputs to realign the input-output pairs.
- skip_num_rows
Default:1
C++ Type:unsigned int
Controllable:No
Description:Number of rows to ignore from training. We usually skip the 1st row from the reporter since it contains only initial values.
- standardize_advantage
Default:True
C++ Type:bool
Controllable:No
Description:Switch to enable the shifting and normalization of the advantages in the PPO algorithm.
- update_frequency
Default:1
C++ Type:unsigned int
Controllable:No
Description:Number of transient simulations from which data is collected before the controller neural network is updated.
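The decay_factor and standardize_advantage parameters above describe two common steps in PPO training: computing the discounted return and standardizing the advantages. The following Python sketch shows one way these steps are typically written. It assumes the usual backward recursion for the return, defines the advantage as the return minus a critic estimate, and uses the population standard deviation. All of these are assumptions rather than confirmed details of this object, and the function names and sample values are hypothetical.

```python
import math


def discounted_returns(rewards, decay_factor=1.0):
    """Return-to-go G_t = r_t + decay_factor * G_{t+1}, computed backwards in time."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + decay_factor * running
        returns[t] = running
    return returns


def standardize(values, eps=1e-10):
    """Shift values to zero mean and scale them to unit standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    # eps guards against division by zero when all values are identical.
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical rewards from one transient simulation and critic estimates.
rewards = [1.0, 1.0, 1.0]
critic_values = [1.5, 1.0, 0.5]

returns = discounted_returns(rewards, decay_factor=0.5)
advantages = [g - v for g, v in zip(returns, critic_values)]
standardized_advantages = standardize(advantages)
```

With decay_factor set to 1 (the documented default), every later reward contributes to the return at full weight. Values below 1 progressively discount rewards from later steps.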
Optional Parameters
- allow_duplicate_execution_on_initial
Default:False
C++ Type:bool
Controllable:No
Description:In the case where this UserObject is depended upon by an initial condition, allow it to be executed twice during the initial setup (once before the IC and again after mesh adaptivity, if applicable).
- execute_on
Default:TIMESTEP_END
C++ Type:ExecFlagEnum
Controllable:No
Description:The list of flag(s) indicating when this object should be executed. For a description of each flag, see https://mooseframework.inl.gov/source/interfaces/SetupInterface.html.
- execution_order_group
Default:0
C++ Type:int
Controllable:No
Description:Execution order groups are executed in increasing order (e.g., the lowest number is executed first). Note that negative group numbers may be used to execute groups before the default (0) group. Please refer to the user object documentation for ordering of user object execution within a group.
- force_postaux
Default:False
C++ Type:bool
Controllable:No
Description:Forces the UserObject to be executed in POSTAUX
- force_preaux
Default:False
C++ Type:bool
Controllable:No
Description:Forces the UserObject to be executed in PREAUX
- force_preic
Default:False
C++ Type:bool
Controllable:No
Description:Forces the UserObject to be executed in PREIC during initial setup
Execution Scheduling Parameters
- control_tags
C++ Type:std::vector<std::string>
Controllable:No
Description:Adds user-defined labels for accessing object parameters via control logic.
- enable
Default:True
C++ Type:bool
Controllable:Yes
Description:Set the enabled status of the MooseObject.
- use_displaced_mesh
Default:False
C++ Type:bool
Controllable:No
Description:Whether or not this object should use the displaced mesh for computation. Note that in the case this is true but no displacements are provided in the Mesh block the undisplaced mesh will still be used.
Advanced Parameters
- prop_getter_suffix
C++ Type:MaterialPropertyName
Unit:(no unit assumed)
Controllable:No
Description:An optional suffix parameter that can be appended to any attempt to retrieve/get material properties. The suffix will be prepended with a '_' character.
- use_interpolated_state
Default:False
C++ Type:bool
Controllable:No
Description:For the old and older state use projected material properties interpolated at the quadrature points. To set up projection use the ProjectedStatefulMaterialStorageAction.
Material Property Retrieval Parameters
Input Files
References
- John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov.
Proximal policy optimization algorithms.
arXiv preprint arXiv:1707.06347, 2017.