Covariance System
Overview
Some surrogate models allow for a covariance function (often referred to as kernel functions or simply kernels) to project the input parameters into a different input space. The Gaussian process surrogate is an example of such a surrogate. These covariance functions can be defined in the [Covariance]
block.
Creating a Covariance Function
A covariance function is created by inheriting from CovarainceFunctionBase
and overriding the methods in the base class.
Using a Covariance Function
In the GaussianProcessHandler
The GaussianProcessHandler is a class which incorporates the necessary data structures and functions to create, train, and use Gaussian Processes. One of the most important members of this handler class is the covariance function:
CovarianceFunctionBase * _covariance_function = nullptr;
(modules/stochastic_tools/include/utils/GaussianProcessHandler.h)The covariance function can be initialized in the handler by following the examples given in source description. Objects like GaussianProcessTrainer or GaussianProcess can then access the covariance function through the handler class.
CovarianceInterface
Alternatively, by inheriting from CovarianceInterface
, the child classes can easily fetch covariance functions using the helper functions. Good examples are the GaussianProcessTrainer and GaussianProcess which utilize the helper functions to link an input covariance function to the GaussianProcessHandler:
_gp_handler.initialize(
getCovarianceFunctionByName(parameters.get<UserObjectName>("covariance_function")),
tune_parameters,
lower_bounds,
upper_bounds);
(modules/stochastic_tools/src/surrogates/GaussianProcessTrainer.C)In a Surrogate
If the surrogate loads the training data from a file, the LoadCovarianceDataAction automatically reconstructs the covariance object used in the training phase, and calls the surrogate setupCovariance()
method to make the linkage. This recreation is done by storing the hyper parameter map in the Gaussian Process handler of the trainer for use in the surrogate.
Example Input File Syntax
[Covariance]
[covar]
type = SquaredExponentialCovariance
signal_variance = 1 #Use a signal variance of 1 in the kernel
noise_variance = 1e-6 #A small amount of noise can help with numerical stability
length_factor = '0.38971 0.38971' #Select a length factor for each parameter (k and q)
[]
[]
(modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_squared_exponential_training.i)Available Objects
- Stochastic Tools App
- ExponentialCovarianceExponential covariance function.
- MaternHalfIntCovarianceMatern half-integer covariance function.
- SquaredExponentialCovarianceSquared Exponential covariance function.
Available Actions
- Stochastic Tools App
- AddCovarianceActionAdds Covariance objects contained within the
[Trainers]
and[Surrogates]
input blocks.
(modules/stochastic_tools/include/utils/GaussianProcessHandler.h)
// This file is part of the MOOSE framework
// https://www.mooseframework.org
//
// All rights reserved, see COPYRIGHT for full restrictions
// https://github.com/idaholab/moose/blob/master/COPYRIGHT
//
// Licensed under LGPL 2.1, please see LICENSE for details
// https://www.gnu.org/licenses/lgpl-2.1.html
#pragma once
#include "Standardizer.h"
#include <Eigen/Dense>
#include "CovarianceFunctionBase.h"
namespace StochasticTools
{
/**
* Utility class dedicated to hold structures and functions commont to
* Gaussian Processes. It can be used to standardize parameters, manipulate
* covariance data and compute additional stored matrices.
*/
class GaussianProcessHandler
{
public:
GaussianProcessHandler();
/**
* Initializes the most important structures in the Gaussian Process: the
* covariance function and a tuning map which is used if the user requires
* parameter tuning.
* @param covariance_function Pointer to the covariance function that
* needs to be used for the Gaussian Process.
* @param params_to_tune List of parameters which need to be tuned.
* @param min List of lower bounds for the parameter tuning.
* @param max List of upper bounds for parameter tuning.
*/
void initialize(CovarianceFunctionBase * covariance_function,
const std::vector<std::string> params_to_tune,
std::vector<Real> min = std::vector<Real>(),
std::vector<Real> max = std::vector<Real>());
/**
* Sets up the covariance matrix given data and optimization options.
* @param training_params The training parameter values (x values) for the
* covariance matrix.
* @param training_data The training data (y values) for the inversion of the
* covariance matrix.
* @param opt_type The optimization algorithm for the hyperparameters.
* @param tao_options Additional options if TAO is used for parameter optimization.
* @param show_tao Switch to show details of TAO optimization.
*/
void setupCovarianceMatrix(const RealEigenMatrix & training_params,
const RealEigenMatrix & training_data,
MooseEnum opt_type,
std::string tao_options = "",
bool show_tao = false);
/**
* Sets up the Cholesky decomposition and inverse action of the covariance matrix.
* @param input The vector/matrix which right multiples the inverse of the covariance matrix.
*/
void setupStoredMatrices(const RealEigenMatrix & input);
/**
* Finds and links the covariance function to this object. Used mainly in the
* covariance data action.
* @param covariance_function Pointer to the covariance function that
* needs to be used for the Gaussian Process.
*/
void linkCovarianceFunction(CovarianceFunctionBase * covariance_function);
/**
* Sets up the tuning map which is used if the user requires parameter tuning.
* @param params_to_tune List of parameters which need to be tuned.
* @param min List of lower bounds for the parameter tuning.
* @param max List of upper bounds for parameter tuning.
*/
void generateTuningMap(const std::vector<std::string> params_to_tune,
std::vector<Real> min = std::vector<Real>(),
std::vector<Real> max = std::vector<Real>());
/**
* Standardizes the vector of input parameters (x values).
* @param parameters The vector/matrix of input data.
* @param keep_moments If previously computed or new moments are to be used.
*/
void standardizeParameters(RealEigenMatrix & parameters, bool keep_moments = false);
/**
* Standardizes the vector of responses (y values).
* @param data The vector/matrix of input data.
* @param keep_moments If previously computed or new moments are to be used.
*/
void standardizeData(RealEigenMatrix & data, bool keep_moments = false);
/**
* Tune the hyper parameters in the covariance function using PETSc-TAO.
* @param training_params The training parameter values (x values) for the
* covariance matrix.
* @param training_data The training data (y values) for the inversion of the
* covariance matrix.
* @param tao_options Additional options for TAO.
* @param show_tao Switch to show details of TAO optimization.
*/
PetscErrorCode tuneHyperParamsTAO(const RealEigenMatrix & training_params,
const RealEigenMatrix & training_data,
std::string tao_options = "",
bool show_tao = false);
/// Used to form initial guesses in the TAO optimization routines
PetscErrorCode formInitialGuessTAO(Vec theta_vec);
/// Build the bounds for the hyper parameter optimization with TAO
void buildHyperParamBoundsTAO(libMesh::PetscVector<Number> & theta_l,
libMesh::PetscVector<Number> & theta_u) const;
// Wrapper for PETSc function callback
static PetscErrorCode
formFunctionGradientWrapper(Tao tao, Vec theta, PetscReal * f, Vec Grad, void * ptr);
// Computes Gradient of the loss function
void formFunctionGradient(Tao tao, Vec theta, PetscReal * f, Vec Grad);
/// Function used to convert the hyperparameter maps in this object to
/// Petsc vectors
void mapToPetscVec(
const std::unordered_map<std::string, std::tuple<unsigned int, unsigned int, Real, Real>> &
tuning_data,
const std::unordered_map<std::string, Real> & scalar_map,
const std::unordered_map<std::string, std::vector<Real>> & vector_map,
libMesh::PetscVector<Number> & petsc_vec);
/// Function used to convert the PETSc vectors back to hyperparameter maps
void petscVecToMap(
const std::unordered_map<std::string, std::tuple<unsigned int, unsigned int, Real, Real>> &
tuning_data,
std::unordered_map<std::string, Real> & scalar_map,
std::unordered_map<std::string, std::vector<Real>> & vector_map,
const libMesh::PetscVector<Number> & petsc_vec);
/// @{
/**
* Get constant reference to the contained structures
*/
const StochasticTools::Standardizer & getParamStandardizer() const { return _param_standardizer; }
const StochasticTools::Standardizer & getDataStandardizer() const { return _data_standardizer; }
const RealEigenMatrix & getK() const { return _K; }
const RealEigenMatrix & getKResultsSolve() const { return _K_results_solve; }
const Eigen::LLT<RealEigenMatrix> & getKCholeskyDecomp() const { return _K_cho_decomp; }
const CovarianceFunctionBase & getCovarFunction() const { return *_covariance_function; }
const CovarianceFunctionBase * getCovarFunctionPtr() const { return _covariance_function; }
const std::string & getCovarType() const { return _covar_type; }
const unsigned int & getNumTunableParams() const { return _num_tunable; }
const std::unordered_map<std::string, Real> & getHyperParamMap() const { return _hyperparam_map; }
const std::unordered_map<std::string, std::vector<Real>> & getHyperParamVectorMap() const
{
return _hyperparam_vec_map;
}
///@}
/// @{
/**
* Get non-constant reference to the contained structures (if they need to be modified from the
* utside)
*/
StochasticTools::Standardizer & paramStandardizer() { return _param_standardizer; }
StochasticTools::Standardizer & dataStandardizer() { return _data_standardizer; }
RealEigenMatrix & K() { return _K; }
RealEigenMatrix & KResultsSolve() { return _K_results_solve; }
Eigen::LLT<RealEigenMatrix> & KCholeskyDecomp() { return _K_cho_decomp; }
CovarianceFunctionBase * covarFunctionPtr() { return _covariance_function; }
CovarianceFunctionBase & covarFunction() { return *_covariance_function; }
std::string & covarType() { return _covar_type; }
std::unordered_map<std::string, std::tuple<unsigned int, unsigned int, Real, Real>> & tuningData()
{
return _tuning_data;
}
std::unordered_map<std::string, Real> & hyperparamMap() { return _hyperparam_map; }
std::unordered_map<std::string, std::vector<Real>> & hyperparamVectorMap()
{
return _hyperparam_vec_map;
}
///@}
protected:
/// Covariance function object
CovarianceFunctionBase * _covariance_function = nullptr;
/// Contains tuning inforation. Index of hyperparam, size, and min/max bounds
std::unordered_map<std::string, std::tuple<unsigned int, unsigned int, Real, Real>> _tuning_data;
/// Number of tunable hyperparameters
unsigned int _num_tunable;
/// Type of covariance function used for this surrogate
std::string _covar_type;
/// Tao Communicator
Parallel::Communicator _tao_comm;
/// Scalar hyperparameters. Stored for use in surrogate
std::unordered_map<std::string, Real> _hyperparam_map;
/// Vector hyperparameters. Stored for use in surrogate
std::unordered_map<std::string, std::vector<Real>> _hyperparam_vec_map;
/// Standardizer for use with params (x)
StochasticTools::Standardizer _param_standardizer;
/// Standardizer for use with data (y)
StochasticTools::Standardizer _data_standardizer;
/// An _n_sample by _n_sample covariance matrix constructed from the selected kernel function
RealEigenMatrix _K;
/// A solve of Ax=b via Cholesky.
RealEigenMatrix _K_results_solve;
/// Cholesky decomposition Eigen object
Eigen::LLT<RealEigenMatrix> _K_cho_decomp;
/// Paramaters (x) used for training, along with statistics
const RealEigenMatrix * _training_params;
/// Data (y) used for training
const RealEigenMatrix * _training_data;
};
} // StochasticTools namespac
template <>
void dataStore(std::ostream & stream, Eigen::LLT<RealEigenMatrix> & decomp, void * context);
template <>
void dataLoad(std::istream & stream, Eigen::LLT<RealEigenMatrix> & decomp, void * context);
template <>
void dataStore(std::ostream & stream,
StochasticTools::GaussianProcessHandler & gp_utils,
void * context);
template <>
void
dataLoad(std::istream & stream, StochasticTools::GaussianProcessHandler & gp_utils, void * context);
(modules/stochastic_tools/src/surrogates/GaussianProcessTrainer.C)
// This file is part of the MOOSE framework
// https://www.mooseframework.org
//
// All rights reserved, see COPYRIGHT for full restrictions
// https://github.com/idaholab/moose/blob/master/COPYRIGHT
//
// Licensed under LGPL 2.1, please see LICENSE for details
// https://www.gnu.org/licenses/lgpl-2.1.html
#include "GaussianProcessTrainer.h"
#include "Sampler.h"
#include "CartesianProduct.h"
#include <petsctao.h>
#include <petscdmda.h>
#include "libmesh/petsc_vector.h"
#include "libmesh/petsc_matrix.h"
#include <math.h>
registerMooseObject("StochasticToolsApp", GaussianProcessTrainer);
InputParameters
GaussianProcessTrainer::validParams()
{
InputParameters params = SurrogateTrainer::validParams();
params.addClassDescription(
"Provides data preperation and training for a Gaussian Process surrogate model.");
params.addRequiredParam<ReporterName>(
"response", "Reporter value of response results, can be vpp with <vpp_name>/<vector_name>.");
params.addParam<std::vector<ReporterName>>(
"predictors",
std::vector<ReporterName>(),
"Reporter values used as the independent random variables, If 'predictors' and "
"'predictor_cols' are both empty, all sampler columns are used.");
params.addParam<std::vector<unsigned int>>(
"predictor_cols",
std::vector<unsigned int>(),
"Sampler columns used as the independent random variables, If 'predictors' and "
"'predictor_cols' are both empty, all sampler columns are used.");
params.addRequiredParam<UserObjectName>("covariance_function", "Name of covariance function.");
params.addParam<bool>(
"standardize_params", true, "Standardize (center and scale) training parameters (x values)");
params.addParam<bool>(
"standardize_data", true, "Standardize (center and scale) training data (y values)");
// Already preparing to use ADAM here
MooseEnum tuning_type("tao none", "none");
params.addParam<MooseEnum>(
"tuning_algorithm", tuning_type, "Hyper parameter optimizaton algorithm");
params.addParam<std::string>("tao_options",
"Command line options for PETSc/TAO hyperparameter optimization");
params.addParam<bool>("show_tao", "Switch to show TAO solver results");
params.addParam<std::vector<std::string>>("tune_parameters",
"Select hyperparameters to be tuned");
params.addParam<std::vector<Real>>("tuning_min", "Minimum allowable tuning value");
params.addParam<std::vector<Real>>("tuning_max", "Maximum allowable tuning value");
return params;
}
GaussianProcessTrainer::GaussianProcessTrainer(const InputParameters & parameters)
: SurrogateTrainer(parameters),
CovarianceInterface(parameters),
_gp_handler(declareModelData<StochasticTools::GaussianProcessHandler>("_gp_handler")),
_training_params(declareModelData<RealEigenMatrix>("_training_params")),
_standardize_params(getParam<bool>("standardize_params")),
_standardize_data(getParam<bool>("standardize_data")),
_do_tuning(isParamValid("tune_parameters")),
_tuning_algorithm(getParam<MooseEnum>("tuning_algorithm")),
_sampler_row(getSamplerData()),
_rval(getTrainingData<Real>(getParam<ReporterName>("response"))),
_pvals(getParam<std::vector<ReporterName>>("predictors").size()),
_pcols(getParam<std::vector<unsigned int>>("predictor_cols")),
_n_params((_pvals.empty() && _pcols.empty()) ? _sampler.getNumberOfCols()
: (_pvals.size() + _pcols.size()))
{
const auto & pnames = getParam<std::vector<ReporterName>>("predictors");
for (unsigned int i = 0; i < pnames.size(); ++i)
_pvals[i] = &getTrainingData<Real>(pnames[i]);
// If predictors and predictor_cols are empty, use all sampler columns
if (_pvals.empty() && _pcols.empty())
{
_pcols.resize(_sampler.getNumberOfCols());
std::iota(_pcols.begin(), _pcols.end(), 0);
}
if (_do_tuning && _tuning_algorithm == "none")
paramError("tuning_algorithm",
"No tuning algorithm is selected for the hyper parameter optimization!");
std::vector<std::string> tune_parameters(getParam<std::vector<std::string>>("tune_parameters"));
// Error Checking
if (isParamValid("tuning_min") &&
(getParam<std::vector<Real>>("tuning_min").size() != tune_parameters.size()))
mooseError("tuning_min size does not match tune_parameters");
if (isParamValid("tuning_max") &&
(getParam<std::vector<Real>>("tuning_max").size() != tune_parameters.size()))
mooseError("tuning_max size does not match tune_parameters");
std::vector<Real> lower_bounds, upper_bounds;
if (isParamValid("tuning_min"))
lower_bounds = getParam<std::vector<Real>>("tuning_min");
if (isParamValid("tuning_max"))
upper_bounds = getParam<std::vector<Real>>("tuning_max");
_gp_handler.initialize(
getCovarianceFunctionByName(parameters.get<UserObjectName>("covariance_function")),
tune_parameters,
lower_bounds,
upper_bounds);
}
void
GaussianProcessTrainer::preTrain()
{
_training_params.setZero(_sampler.getNumberOfRows(), _n_params);
_training_data.setZero(_sampler.getNumberOfRows(), 1);
}
void
GaussianProcessTrainer::train()
{
unsigned int d = 0;
for (const auto & val : _pvals)
_training_params(_row, d++) = *val;
for (const auto & col : _pcols)
_training_params(_row, d++) = _sampler_row[col];
// Loading result data from response reporter
_training_data(_row, 0) = _rval;
}
void
GaussianProcessTrainer::postTrain()
{
for (unsigned int ii = 0; ii < _training_params.rows(); ++ii)
{
for (unsigned int jj = 0; jj < _training_params.cols(); ++jj)
gatherSum(_training_params(ii, jj));
gatherSum(_training_data(ii, 0));
}
// Standardize (center and scale) training params
if (_standardize_params)
_gp_handler.standardizeParameters(_training_params);
// if not standardizing data set mean=0, std=1 for use in surrogate
else
_gp_handler.paramStandardizer().set(0, 1, _n_params);
// Standardize (center and scale) training data
if (_standardize_data)
_gp_handler.standardizeData(_training_data);
// if not standardizing data set mean=0, std=1 for use in surrogate
else
_gp_handler.dataStandardizer().set(0, 1, _n_params);
// Setup the covariance
_gp_handler.setupCovarianceMatrix(
_training_params,
_training_data,
_tuning_algorithm,
isParamValid("tao_options") ? getParam<std::string>("tao_options") : "",
isParamValid("show_tao") ? getParam<bool>("show_tao") : false);
}
(modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_squared_exponential_training.i)
[StochasticTools]
[]
[Distributions]
[k_dist]
type = Uniform
lower_bound = 1
upper_bound = 10
[]
[q_dist]
type = Uniform
lower_bound = 9000
upper_bound = 11000
[]
[]
[Samplers]
[train_sample]
type = MonteCarlo
num_rows = 10
distributions = 'k_dist q_dist'
execute_on = PRE_MULTIAPP_SETUP
[]
[]
[MultiApps]
[sub]
type = SamplerFullSolveMultiApp
input_files = sub.i
sampler = train_sample
[]
[]
[Controls]
[cmdline]
type = MultiAppCommandLineControl
multi_app = sub
sampler = train_sample
param_names = 'Materials/conductivity/prop_values Kernels/source/value'
[]
[]
[Transfers]
[data]
type = SamplerReporterTransfer
from_multi_app = sub
sampler = train_sample
stochastic_reporter = results
from_reporter = 'avg/value'
[]
[]
[Reporters]
[results]
type = StochasticReporter
parallel_type = ROOT
[]
[]
[Trainers]
[GP_avg_trainer]
type = GaussianProcessTrainer
execute_on = timestep_end
covariance_function = 'covar' #Choose a squared exponential for the kernel
standardize_params = 'true' #Center and scale the training params
standardize_data = 'true' #Center and scale the training data
sampler = train_sample
response = results/data:avg:value
[]
[]
[Covariance]
[covar]
type=SquaredExponentialCovariance
signal_variance = 1 #Use a signal variance of 1 in the kernel
noise_variance = 1e-6 #A small amount of noise can help with numerical stability
length_factor = '0.38971 0.38971' #Select a length factor for each parameter (k and q)
[]
[]
[Outputs]
file_base = gauss_process_training
[out]
type = SurrogateTrainerOutput
trainers = 'GP_avg_trainer'
execute_on = FINAL
[]
[]