Trainers System
Overview
Objects within the [Trainers] block are derived from SurrogateTrainer and are designed for creating training data for use with a model (see Surrogates System).
Creating a SurrogateTrainer
To create a trainer, the new object should inherit from SurrogateTrainer, which is derived from GeneralUserObject. SurrogateTrainer overrides the execute() function to loop through the rows of a given sampler, specified by the "sampler" parameter:
void
SurrogateTrainer::execute()
{
  checkIntegrity();
  _row = _sampler.getLocalRowBegin();
  _local_row = 0;
  preTrain();
  for (_row = _sampler.getLocalRowBegin(); _row < _sampler.getLocalRowEnd(); ++_row)
  {
    // Need to do this manually in order to keep the iterators valid
    const std::vector<Real> data = _sampler.getNextLocalRow();
    for (unsigned int i = 0; i < _row_data.size(); ++i)
      _row_data[i] = data[i];
    // Set training data
    for (auto & pair : _training_data)
      pair.second->setCurrentIndex((pair.second->isDistributed() ? _local_row : _row));
    train();
    _local_row++;
  }
  postTrain();
}
(modules/stochastic_tools/src/surrogates/SurrogateTrainer.C)
The method executes once per execution flag (see SetupInterface (execute_on)) on each processor. There are three virtual functions that derived classes can and should override:
/*
* Setup function called before sampler loop
*/
virtual void preTrain() {}
/*
* Function that needs to be overridden, called during sampler loop
*/
virtual void train() {}
/*
* Function called after sampler loop, used for mpi communication mainly
*/
virtual void postTrain() {}
(modules/stochastic_tools/include/surrogates/SurrogateTrainer.h)
- preTrain() is called before the sampler loop and is typically used for resizing variables for the given number of data points.
- train() is called within the sampler loop where member variables _local_row, _row, and those declared with getTrainingData are updated.
- postTrain() is called after the sampler loop and is typically used for MPI communication.
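The call order described above can be illustrated with a small standalone sketch. It does not use MOOSE: MockTrainer and MeanTrainer are hypothetical names invented for this example, the sampler is replaced by a plain vector of rows, and there is no MPI, so postTrain() only finalizes a local result. It is meant to show when each hook runs, not how SurrogateTrainer is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for SurrogateTrainer (NOT the MOOSE class).
// It only reproduces the call order: preTrain, train per row, postTrain.
class MockTrainer
{
public:
  explicit MockTrainer(std::vector<std::vector<double>> rows) : _rows(std::move(rows)) {}
  virtual ~MockTrainer() = default;

  void execute()
  {
    preTrain();
    for (_row = 0; _row < _rows.size(); ++_row)
    {
      _row_data = _rows[_row]; // expose the current row to train()
      train();
    }
    postTrain();
  }

protected:
  virtual void preTrain() {}
  virtual void train() {}
  virtual void postTrain() {}

  std::size_t numRows() const { return _rows.size(); }

  std::size_t _row = 0;          // index of the row being processed
  std::vector<double> _row_data; // values of the row being processed

private:
  std::vector<std::vector<double>> _rows;
};

// Hypothetical derived trainer: mean of the first column of all rows.
class MeanTrainer : public MockTrainer
{
public:
  using MockTrainer::MockTrainer;
  double mean() const { return _mean; }

protected:
  // Reset accumulators before the row loop.
  void preTrain() override
  {
    _sum = 0.0;
    _mean = 0.0;
  }

  // Accumulate one row.
  void train() override
  {
    assert(!_row_data.empty());
    _sum += _row_data[0];
  }

  // Finalize after the loop (a real trainer would typically communicate here).
  void postTrain() override
  {
    if (numRows() > 0)
      _mean = _sum / static_cast<double>(numRows());
  }

private:
  double _sum = 0.0;
  double _mean = 0.0;
};
```

Constructing a MeanTrainer from a set of rows and calling execute() once runs all three hooks in order, after which mean() holds the result.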
Gathering Training Data
To ease the gathering of the data needed for training, SurrogateTrainer includes an API for getting reporter data that takes care of the necessary size checks and distributed data indexing. The idea is to emulate the element loop behavior in other MOOSE objects. For instance, in a kernel, the value of _u corresponds to the solution in an element. Here, data referenced with getTrainingData corresponds to the value of the data in a sampler row. The returned reference is to be used in the train() function. Derived classes can call the following functions to gather training data:
/*
* Get a reference to training data given a reporter name
*/
template <typename T>
const T & getTrainingData(const ReporterName & rname);
/*
* Get a reference to the sampler row data
*/
const std::vector<Real> & getSamplerData() const { return _row_data; };
(modules/stochastic_tools/include/surrogates/SurrogateTrainer.h)
- getTrainingData<T>(const ReporterName & rname) will get a vector of training data from a reporter value of type std::vector<T>, whose name is defined by rname.
- getSamplerData() will simply return a vector of the sampler row.
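The "reference that follows the current row" idea can be sketched without MOOSE as follows. RowCursor, samplerData(), responseData(), and weightedSum() are hypothetical names made up for this illustration, and the response is assumed to be a single double per row. The sketch only shows the pattern: a reference is obtained once, and the loop changes what it refers to before each training step.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the trainer's data plumbing (NOT MOOSE code).
class RowCursor
{
public:
  RowCursor(std::vector<std::vector<double>> samples, std::vector<double> responses)
    : _samples(std::move(samples)), _responses(std::move(responses))
  {
    // One response per sampler row; similar in spirit to a size check.
    assert(_samples.size() == _responses.size());
  }

  // References are handed out once; their contents change with the row.
  const std::vector<double> & samplerData() const { return _row_data; }
  const double & responseData() const { return _current_response; }

  std::size_t numRows() const { return _samples.size(); }

  // Make the handed-out references reflect row i.
  void setCurrentIndex(std::size_t i)
  {
    assert(i < _samples.size());
    _row_data = _samples[i];
    _current_response = _responses[i];
  }

private:
  std::vector<std::vector<double>> _samples;
  std::vector<double> _responses;
  std::vector<double> _row_data;
  double _current_response = 0.0;
};

// Sums x[0] * y over all rows, reading only through references bound once.
inline double
weightedSum(RowCursor & cursor)
{
  const std::vector<double> & x = cursor.samplerData(); // bound before the loop
  const double & y = cursor.responseData();             // bound before the loop
  double total = 0.0;
  for (std::size_t i = 0; i < cursor.numRows(); ++i)
  {
    cursor.setCurrentIndex(i); // x and y now hold the values of row i
    assert(!x.empty());
    total += x[0] * y;
  }
  return total;
}
```

The body of the loop never indexes into the full data set, which is the same convenience a kernel gets from _u.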
Declaring Training Data
Model data must be declared in the object constructor using the declareModelData methods, which are defined as follows. The desired type is provided as the template argument (T) and the name of the data is the first input parameter. The second parameter, if provided, is the initial value for the training data. The name provided is arbitrary, but is used by the model object(s) designed to work with the training data (see Surrogates System).
template <typename T>
T & declareModelData(const std::string & data_name);
template <typename T>
T & declareModelData(const std::string & data_name, const T & value);
(modules/stochastic_tools/include/surrogates/SurrogateTrainer.h)
These methods return a reference to the desired type that should be populated in the aforementioned train() method. For example, in the PolynomialChaosTrainer trainer object a scalar value, "order", is stored by declaring a reference to the desired type in the header.
const unsigned int & _order;
(modules/stochastic_tools/include/surrogates/PolynomialChaosTrainer.h)
Within the source the declared references are initialized with a declare method that includes data initialization.
_order(declareModelData<unsigned int>("_order", getParam<unsigned int>("order"))),
(modules/stochastic_tools/src/surrogates/PolynomialChaosTrainer.C)
The training data system leverages the Restartable system within MOOSE. As such, the data stored can be of an arbitrary type and is automatically used for restarting simulations.
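The declare-by-name pattern can be sketched in isolation as shown here. ModelDataStore, declare(), get(), and OrderTrainer are hypothetical names for this example only. The sketch does not reproduce the Restartable system or any file output; it only shows how a constructor can bind member references to named, typed entries that another object later looks up by the same name.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical name-keyed store (NOT MOOSE's Restartable system).
class ModelDataStore
{
public:
  // Create the entry if it does not exist yet and return a reference to it.
  template <typename T>
  T & declare(const std::string & name, const T & value = T())
  {
    auto it = _data.find(name);
    if (it == _data.end())
      it = _data.emplace(name, Entry{std::type_index(typeid(T)), std::make_shared<T>(value)})
               .first;
    assert(it->second.type == std::type_index(typeid(T)));
    return *std::static_pointer_cast<T>(it->second.ptr);
  }

  // Read-only lookup, as a model object might do by name.
  template <typename T>
  const T & get(const std::string & name) const
  {
    auto it = _data.find(name);
    assert(it != _data.end());
    assert(it->second.type == std::type_index(typeid(T)));
    return *std::static_pointer_cast<T>(it->second.ptr);
  }

private:
  struct Entry
  {
    std::type_index type;      // remembers T so mismatched lookups are caught
    std::shared_ptr<void> ptr; // type-erased owner of the stored value
  };
  std::map<std::string, Entry> _data;
};

// Hypothetical trainer that binds references in its constructor.
class OrderTrainer
{
public:
  OrderTrainer(ModelDataStore & store, unsigned int order)
    : _order(store.declare<unsigned int>("_order", order)), // with initial value
      _coeff_sum(store.declare<double>("_coeff_sum"))       // default-initialized
  {
  }

  unsigned int order() const { return _order; }

  // Populate the declared data through the stored reference.
  void train(double value) { _coeff_sum += value; }

private:
  const unsigned int & _order;
  double & _coeff_sum;
};
```

Because the trainer writes through references into the store, anything holding the store can read the results by name without knowing about the trainer class.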
Output Model Data
Training model data can be output to a binary file using the SurrogateTrainerOutput object.
Example Input File Syntax
The following input file snippet adds a PolynomialChaosTrainer object for training. Please refer to the documentation on the individual models for more details.
[Trainers]
  [poly_chaos]
    type = PolynomialChaosTrainer
    execute_on = timestep_end
    order = 5
    distributions = 'D_dist S_dist'
    sampler = sample
    response = storage/data:avg:value
  []
[]
(modules/stochastic_tools/test/tests/surrogates/poly_chaos/master_2d_mc.i)
Available Objects
- Stochastic Tools App
- GaussianProcessTrainer: Provides data preparation and training for a Gaussian Process surrogate model.
- NearestPointTrainer: Loops over and saves sample values for NearestPointSurrogate.
- PODReducedBasisTrainer: Computes the reduced subspace plus the reduced operators for POD-RB surrogate.
- PolynomialChaosTrainer: Computes and evaluates polynomial chaos surrogate model.
- PolynomialRegressionTrainer: Computes coefficients for polynomial regression model.
Available Actions
- Stochastic Tools App
- AddSurrogateAction: Adds SurrogateTrainer and SurrogateModel objects contained within the [Trainers] and [Surrogates] input blocks.