- covariance_function
C++ Type:UserObjectName
Controllable:No
Description:Name of covariance function.
- response
C++ Type:ReporterName
Controllable:No
Description:Reporter value of response results; can be a vectorpostprocessor value given as '<vpp_name>/<vector_name>'.
- sampler
C++ Type:SamplerName
Controllable:No
Description:Sampler used to create predictor and response data.
GaussianProcessTrainer
"Gaussian Processes for Machine Learning" by Rasmussen and Williams (2005) provides a well-written discussion of Gaussian Processes, and reading it is highly encouraged. Chapters 1-5 cover the topics presented here with far greater detail, depth, and rigor.
The documentation here is meant to give some practical insight for users to begin creating surrogate models with Gaussian Processes.
Given a set of inputs $\mathbf{x}$ for which we have made observations of the corresponding outputs $\mathbf{y}$ using the system $\mathbf{y} = f(\mathbf{x})$, we wish to predict the outputs $\mathbf{y}^*$ associated with another set of inputs $\mathbf{x}^*$ without evaluation of $f$, which is presumed costly.
Parameter Covariance
In overly simplistic terms, Gaussian Process Modeling is driven by the idea that trials which are "close" in their input parameter space will be "close" in their output space. Closeness in the parameter space is driven by the covariance function (also called a kernel function, not to be confused with a MOOSE Framework kernel). This covariance function is used to generate a covariance matrix between the complete set of parameters $\{\mathbf{x}, \mathbf{x}^*\}$, which can then be interpreted block-wise as the various covariance matrices between $\mathbf{x}$ and $\mathbf{x}^*$: $\mathbf{K} = K(\mathbf{x},\mathbf{x})$, $\mathbf{K}^* = K(\mathbf{x},\mathbf{x}^*)$, and $\mathbf{K}^{**} = K(\mathbf{x}^*,\mathbf{x}^*)$.
The Gaussian Process Model consists of an infinite collection of functions, all of which agree with the training/observation data. Importantly, the collection has closed forms for 2nd order statistics (mean and variance). When used as a surrogate, the nominal value is chosen to be the mean value. The method can be broken down into two steps: definition of the prior distribution, then conditioning on observed data.
Gaussian processes
A Gaussian Process is a (potentially infinite) collection of random variables, such that the joint distribution of every finite selection of random variables from the collection is a Gaussian distribution.
Just as a multivariate Gaussian is completely defined by its mean vector and its covariance matrix, a Gaussian Process is completely defined by its mean function and covariance function.
The (potentially) infinite number of random variables within the Gaussian Process correspond to the (potentially) infinite points in the parameter space our surrogate can be evaluated at.
Prior distribution:
We assume the observations (both training and testing) are drawn from a multivariate Gaussian distribution:

$$\begin{bmatrix} \mathbf{y} \\ \mathbf{y}^* \end{bmatrix} \sim \mathcal{N}\left( \boldsymbol{\mu}, \begin{bmatrix} \mathbf{K} & \mathbf{K}^* \\ {\mathbf{K}^*}^T & \mathbf{K}^{**} \end{bmatrix} \right)$$

The covariance matrix is the result of the choice of covariance function. Note that $\boldsymbol{\mu}$ and $\mathbf{K}$ are a vector and a matrix, respectively, and are the result of the mean and covariance functions applied to the sample points.
Zero Mean Assumption: Discussions of Gaussian Processes are typically presented under the assumption that $\boldsymbol{\mu} = \mathbf{0}$. This occurs without loss of generality, since any sample set can be centered by subtracting the sample mean (or via a variety of other preprocessing options). Note that in a training/testing paradigm the testing data is unknown, so the determination of what to use as $\boldsymbol{\mu}$ is based on information from the training data (or some other prior assumption).
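As an illustration of forming the joint prior, the following sketch (Python/NumPy, not MOOSE syntax; the squared-exponential kernel and all variable names are illustrative assumptions, not the MOOSE implementation) builds the block-wise covariance matrix over the combined training and testing points:

```python
import numpy as np

# Hypothetical squared-exponential covariance for illustration; in MOOSE
# the kernel is selected via the covariance_function parameter.
def k(x1, x2, ell=1.0, sigma_f=1.0):
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return sigma_f**2 * np.exp(-0.5 * d2 / ell**2)

x_train = np.array([0.0, 0.5, 1.0])   # observed (training) inputs
x_test = np.array([0.25, 0.75])       # prediction (testing) inputs

# Block-wise joint covariance over the combined parameter set:
# [[K(x, x),   K(x, x*)],
#  [K(x*, x),  K(x*, x*)]]
K = k(x_train, x_train)
Ks = k(x_train, x_test)
Kss = k(x_test, x_test)
K_joint = np.block([[K, Ks], [Ks.T, Kss]])
```

The joint matrix is symmetric, and its diagonal blocks are the within-set covariances used later in conditioning.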
Conditioning:
With the prior formed as above, conditioning on the available training data is performed. This alters the mean and variance to new values $\tilde{\boldsymbol{\mu}}$ and $\tilde{\mathbf{K}}$, restricting the set of possible functions which agree with the training data:

$$\tilde{\boldsymbol{\mu}} = {\mathbf{K}^*}^T \mathbf{K}^{-1} \mathbf{y}, \qquad \tilde{\mathbf{K}} = \mathbf{K}^{**} - {\mathbf{K}^*}^T \mathbf{K}^{-1} \mathbf{K}^*$$
When used as a surrogate, the nominal value is typically taken as the conditioned mean, with the diagonal of the conditioned covariance providing variances which can be used to generate confidence intervals.
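A minimal sketch of the conditioning step (Python/NumPy; the one-dimensional setup, kernel choice, and variable names are assumptions for illustration only):

```python
import numpy as np

def k(x1, x2, ell=1.0):
    # Illustrative squared-exponential kernel
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / ell**2)

x_train = np.array([0.0, 0.5, 1.0])
y_train = np.sin(2 * np.pi * x_train)   # observations (assumed zero-mean)
x_test = np.array([0.25, 0.5])

K = k(x_train, x_train) + 1e-10 * np.eye(len(x_train))  # tiny jitter
Ks = k(x_train, x_test)
Kss = k(x_test, x_test)

# Conditioned (posterior) mean and covariance
alpha = np.linalg.solve(K, y_train)
mu_cond = Ks.T @ alpha
K_cond = Kss - Ks.T @ np.linalg.solve(K, Ks)
```

At a test point that coincides with a training point, the conditioned mean reproduces the observation and the conditioned variance collapses toward zero, reflecting that the collection of functions is pinned to the data there.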
Common Hyperparameters
While the only apparent decision in the above formulation is the choice of covariance function, most covariance functions contain hyperparameters that must be selected in some manner. While each covariance function has its own set of hyperparameters, a few hyperparameters of specific forms are present in many common covariance functions.
Length Factor $\ell$ or $\boldsymbol{\ell}$
Frequently, kernels consider the distance between two input parameters $x$ and $x'$. For a system of only a single parameter, this distance often takes the form

$$r(x, x') = \frac{|x - x'|}{\ell}$$

In this form the factor $\ell$ sets a relevant length scale for the distance measurements.
When multiple input parameters are to be considered, it may be advantageous to specify different length scales for each parameter, resulting in a vector $\boldsymbol{\ell}$. For example, distance may be calculated as

$$r(\mathbf{x}, \mathbf{x}') = \sqrt{\sum_i \frac{(x_i - x'_i)^2}{\ell_i^2}}$$

When used with standardized parameters, $\ell_i$ can be interpreted in units of standard deviation of the relevant parameter.
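The effect of per-parameter length scales can be sketched as follows (Python/NumPy; the sample values and function name are hypothetical):

```python
import numpy as np

def scaled_distance(x, xp, ell):
    """Distance with a separate length scale per input dimension."""
    ell = np.asarray(ell, dtype=float)
    return np.sqrt(np.sum((np.asarray(x) - np.asarray(xp)) ** 2 / ell**2))

x = [1.0, 10.0]
xp = [2.0, 30.0]

# Equal length scales: the second, wider-ranging parameter dominates.
r_iso = scaled_distance(x, xp, [1.0, 1.0])

# A larger length scale on the second parameter rebalances the two
# dimensions so each contributes comparably to the distance.
r_ard = scaled_distance(x, xp, [1.0, 20.0])
```

This is why standardizing parameters (or tuning $\boldsymbol{\ell}$) matters: without it, the parameter with the largest raw range controls "closeness" almost entirely.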
Signal Variance $\sigma_f^2$
This serves as an overall scaling parameter. Given a covariance function $k'$ (which is not a function of $\sigma_f^2$), multiplication by $\sigma_f^2$ yields a new valid covariance function:

$$k(x, x') = \sigma_f^2 \, k'(x, x')$$

This multiplication can also be pulled out of the covariance matrix formation, simply multiplying the matrix formed by $k'$:

$$\mathbf{K} = \sigma_f^2 \, \mathbf{K}'$$
Noise Variance $\sigma_n^2$
The $\sigma_n^2$ hyperparameter represents noise in the collected data, and acts as an additional factor on the variance terms (when $x = x'$). In the matrix representation this adds a factor of $\sigma_n^2$ to the diagonal of the noiseless matrix:

$$\mathbf{K} = \mathbf{K}' + \sigma_n^2 \, \mathbf{I}$$

Due to the addition of $\sigma_n^2$ along the diagonal of the matrix, this hyperparameter can aid in the inversion of the covariance matrix. For this reason, adding a small amount of $\sigma_n^2$ may be preferable even when you believe the data to be noise free.
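The stabilizing effect of the diagonal term can be illustrated as follows (Python/NumPy sketch; the near-duplicate sample points are contrived to make the noiseless matrix ill-conditioned):

```python
import numpy as np

# Near-duplicate training points make the noiseless covariance matrix
# (nearly) singular; adding sigma_n^2 on the diagonal regularizes it.
x = np.array([0.0, 1e-9, 1.0])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)

sigma_n2 = 1e-6
K_noisy = K + sigma_n2 * np.eye(len(x))

cond_plain = np.linalg.cond(K)       # enormous (rows nearly identical)
cond_noisy = np.linalg.cond(K_noisy) # bounded by ~max_eig / sigma_n2
```

The conditioned-mean computation involves $\mathbf{K}^{-1}$, so a bounded condition number directly improves the reliability of that solve.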
Selected Covariance Functions
Table 1: Selected Covariance Functions
| Covariance Function | Description |
|---|---|
| SquaredExponentialCovariance | Also referred to as a radial basis function (RBF), this is a widely used, general-purpose covariance function. Serves as a common starting point for many. |
| ExponentialCovariance | A simple exponential covariance function. |
| MaternHalfIntCovariance | Implementation of the Matern class of covariance functions, where the $\nu$ parameter takes on half-integer values. |
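For intuition, standalone sketches of these covariance families follow (Python/NumPy, written in terms of a pre-computed scaled distance $r$; these are the standard textbook forms under a unit signal variance, not the exact MOOSE implementations, and the Matern case is shown only for $\nu = 3/2$):

```python
import numpy as np

def squared_exponential(r, sigma_f=1.0):
    # RBF kernel: smooth, infinitely differentiable samples
    return sigma_f**2 * np.exp(-0.5 * r**2)

def exponential(r, sigma_f=1.0):
    # Exponential kernel: rough, non-differentiable samples
    return sigma_f**2 * np.exp(-r)

def matern_3_2(r, sigma_f=1.0):
    # Matern kernel with nu = 3/2, one of the half-integer cases
    return sigma_f**2 * (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

r = np.linspace(0.0, 3.0, 50)
k_se, k_exp, k_m32 = squared_exponential(r), exponential(r), matern_3_2(r)
```

All three equal $\sigma_f^2$ at $r = 0$ and decay monotonically with distance; they differ mainly in how quickly correlation drops off and in the smoothness of the functions they generate.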
Input Parameters
- converged_reporter
C++ Type:ReporterName
Controllable:No
Description:Reporter value used to determine if a sample's multiapp solve converged.
- execute_on
Default:TIMESTEP_END
C++ Type:ExecFlagEnum
Controllable:No
Description:The list of flag(s) indicating when this object should be executed, the available options include NONE, INITIAL, LINEAR, NONLINEAR, TIMESTEP_END, TIMESTEP_BEGIN, FINAL, CUSTOM, ALWAYS.
- predictor_cols
C++ Type:std::vector<unsigned int>
Controllable:No
Description:Sampler columns used as the independent random variables. If 'predictors' and 'predictor_cols' are both empty, all sampler columns are used.
- predictors
C++ Type:std::vector<ReporterName>
Controllable:No
Description:Reporter values used as the independent random variables. If 'predictors' and 'predictor_cols' are both empty, all sampler columns are used.
- prop_getter_suffix
C++ Type:MaterialPropertyName
Controllable:No
Description:An optional suffix parameter that can be appended to any attempt to retrieve/get material properties. The suffix will be prepended with a '_' character.
- show_tao
Default:False
C++ Type:bool
Controllable:No
Description:Switch to show TAO solver results
- skip_unconverged_samples
Default:False
C++ Type:bool
Controllable:No
Description:True to skip samples where the multiapp did not converge; 'stochastic_reporter' is required to do this.
- standardize_data
Default:True
C++ Type:bool
Controllable:No
Description:Standardize (center and scale) training data (y values)
- standardize_params
Default:True
C++ Type:bool
Controllable:No
Description:Standardize (center and scale) training parameters (x values)
- tao_options
C++ Type:std::string
Controllable:No
Description:Command line options for PETSc/TAO hyperparameter optimization
- tune_parameters
C++ Type:std::vector<std::string>
Controllable:No
Description:Select hyperparameters to be tuned
- tuning_algorithm
Default:none
C++ Type:MooseEnum
Controllable:No
Description:Hyperparameter optimization algorithm
- tuning_max
C++ Type:std::vector<double>
Controllable:No
Description:Maximum allowable tuning value
- tuning_min
C++ Type:std::vector<double>
Controllable:No
Description:Minimum allowable tuning value
Optional Parameters
- allow_duplicate_execution_on_initial
Default:False
C++ Type:bool
Controllable:No
Description:In the case where this UserObject is depended upon by an initial condition, allow it to be executed twice during the initial setup (once before the IC and again after mesh adaptivity, if applicable).
- control_tags
C++ Type:std::vector<std::string>
Controllable:No
Description:Adds user-defined labels for accessing object parameters via control logic.
- enable
Default:True
C++ Type:bool
Controllable:Yes
Description:Set the enabled status of the MooseObject.
- force_postaux
Default:False
C++ Type:bool
Controllable:No
Description:Forces the UserObject to be executed in POSTAUX
- force_preaux
Default:False
C++ Type:bool
Controllable:No
Description:Forces the UserObject to be executed in PREAUX
- force_preic
Default:False
C++ Type:bool
Controllable:No
Description:Forces the UserObject to be executed in PREIC during initial setup
- use_displaced_mesh
Default:False
C++ Type:bool
Controllable:No
Description:Whether or not this object should use the displaced mesh for computation. Note that in the case this is true but no displacements are provided in the Mesh block the undisplaced mesh will still be used.
Advanced Parameters
Input Files
- (modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_squared_exponential.i)
- (modules/stochastic_tools/examples/surrogates/gaussian_process/gaussian_process_uniform_1D_tuned.i)
- (modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_exponential_tuned.i)
- (modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_exponential.i)
- (modules/stochastic_tools/examples/surrogates/gaussian_process/gaussian_process_uniform_2D.i)
- (modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_squared_exponential_tuned.i)
- (modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_squared_exponential_training.i)
- (modules/stochastic_tools/examples/surrogates/gaussian_process/gaussian_process_uniform_1D.i)
- (modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_Matern_half_int_tuned.i)
- (modules/stochastic_tools/examples/surrogates/gaussian_process/gaussian_process_uniform_2D_tuned.i)
- (modules/stochastic_tools/test/tests/surrogates/gaussian_process/GP_Matern_half_int.i)
- (modules/stochastic_tools/examples/surrogates/gaussian_process/GP_normal_mc.i)
References
- Carl Edward Rasmussen and Christopher K. I. Williams.
Gaussian Processes for Machine Learning.
The MIT Press, 2005.
ISBN 026218253X.