PerfGraphReporterReader
A Python utility that provides an interface for reading PerfGraphReporter output. It rebuilds the graph for easy traversal via the PerfGraphNode and PerfGraphSection objects.
Example usage
Take the following simple diffusion problem, which has a PerfGraphReporter set to output on final:
[Mesh]
  [gmg]
    type = GeneratedMeshGenerator
    dim = 2
    nx = 1
    ny = 1
  []
[]
[Variables/u]
[]
[Kernels/diff]
  type = Diffusion
  variable = u
[]
[BCs]
  [left]
    type = DirichletBC
    variable = u
    boundary = left
    value = 0
  []
  [right]
    type = DirichletBC
    variable = u
    boundary = right
    value = 1
  []
[]
[Executioner]
  type = Steady
  solve_type = 'PJFNK'
  petsc_options_iname = '-pc_type -pc_hypre_type'
  petsc_options_value = 'hypre boomeramg'
[]
[Reporters/perf_graph]
  type = PerfGraphReporter
  execute_on = FINAL
[]
[Outputs/json]
  type = JSON
  execute_on = 'INITIAL FINAL'
[]
(test/tests/reporters/perf_graph_reporter/perf_graph_reporter.i)
For more realistic timing, we pass the command line arguments "Mesh/gmg/nx=500 Mesh/gmg/ny=500", because the test above runs with only a single element. We then execute the following (where MOOSE_DIR is an environment variable set to the directory that contains MOOSE):
$MOOSE_DIR/test/moose_test-opt -i $MOOSE_DIR/test/tests/reporters/perf_graph_reporter/perf_graph_reporter.i Mesh/gmg/nx=500 Mesh/gmg/ny=500
This run generates the desired output in $MOOSE_DIR/test/tests/reporters/perf_graph_reporter/perf_graph_reporter_json.json.
Load a PerfGraphReporterReader with the given output as follows:
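import os
from mooseutils.PerfGraphReporterReader import PerfGraphReporterReader
# MOOSE_DIR must be set; indexing raises a KeyError if it is not
MOOSE_DIR = os.environ['MOOSE_DIR']
# Path to the JSON output generated by the run above
json_file = os.path.join(MOOSE_DIR, 'test/tests/reporters/perf_graph_reporter/perf_graph_reporter_json.json')
pgrr = PerfGraphReporterReader(json_file)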
The examples that follow show detail by calling the info() method on each PerfGraphSection and PerfGraphNode. This is typically not the best way to report the data (you should likely use the member methods on these classes to access the data you need), but it is appropriate here for showing usage. For more information on the member methods of these classes, see the Reference section.
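As a rough sketch of the member-method approach, the helper below builds one summary line per section using only the name(), numCalls(), selfTime(), and percentTime() accessors listed in the Reference. The StandInSection class and the summarize helper are hypothetical names introduced here so the snippet runs without a MOOSE build; the numbers are copied from the example output on this page.

```python
class StandInSection:
    """Hypothetical stand-in exposing a few PerfGraphSection accessors."""

    def __init__(self, name, num_calls, self_time, percent_time):
        self._name = name
        self._num_calls = num_calls
        self._self_time = self_time
        self._percent_time = percent_time

    def name(self):
        return self._name

    def numCalls(self):
        return self._num_calls

    def selfTime(self):
        return self._self_time

    def percentTime(self):
        return self._percent_time


def summarize(sections):
    """Return one formatted line per section using accessor methods only."""
    rows = []
    for section in sections:
        rows.append('{:<40s} calls={:<3d} self={:6.2f} s ({:5.2f}%)'.format(
            section.name(), section.numCalls(),
            section.selfTime(), section.percentTime()))
    return rows


# With a real reader this would be: summarize(pgrr.heaviestSections(5))
demo = [
    StandInSection('NonlinearSystemBase::Kernels', 15, 4.34, 41.36),
    StandInSection('FEProblem::EquationSystems::Init', 1, 2.68, 25.50),
]
for row in summarize(demo):
    print(row)
```

Because the helper only relies on accessor methods, the same function should work unchanged on the objects returned by the reader, provided those accessors behave as documented.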
Heaviest sections
We can determine the five heaviest sections with:
for section in pgrr.heaviestSections(5):
    print(section)
PerfGraphSection "NonlinearSystemBase::Kernels"
PerfGraphSection "FEProblem::EquationSystems::Init"
PerfGraphSection "NonlinearSystemBase::computeJacobianTags"
PerfGraphSection "FEProblem::solve"
PerfGraphSection "MeshGeneratorMesh::cacheInfo"
Similarly, we can obtain more detailed output with:
for section in pgrr.heaviestSections(5):
    print(section.info())
PerfGraphSection "NonlinearSystemBase::Kernels":
Num calls: 15
Level: 3
Time (41.36%): Self 4.34 s, Children 0.00 s, Total 4.34 s
Memory (0.20%): Self 1 MB, Children 0 MB, Total 1 MB
Nodes:
- MooseTestApp (main)
MooseApp::run
MooseApp::execute
MooseApp::executeExecutioner
Steady::PicardSolve
FEProblem::solve
FEProblem::computeResidualSys
FEProblem::computeResidualInternal
FEProblem::computeResidualTags
NonlinearSystemBase::nl::computeResidualTags
NonlinearSystemBase::computeResidualInternal
NonlinearSystemBase::Kernels (14 call(s), 38.3% time, 0.0% memory)
- MooseTestApp (main)
MooseApp::run
MooseApp::execute
MooseApp::executeExecutioner
Steady::PicardSolve
FEProblem::solve
NonlinearSystemBase::nlInitialResidual
FEProblem::computeResidualSys
FEProblem::computeResidualInternal
FEProblem::computeResidualTags
NonlinearSystemBase::nl::computeResidualTags
NonlinearSystemBase::computeResidualInternal
NonlinearSystemBase::Kernels (1 call(s), 3.0% time, 0.2% memory)
PerfGraphSection "FEProblem::EquationSystems::Init":
Num calls: 1
Level: 2
Time (25.50%): Self 2.68 s, Children 0.00 s, Total 2.68 s
Memory (17.59%): Self 86 MB, Children 0 MB, Total 86 MB
Nodes:
- MooseTestApp (main)
MooseApp::run
MooseApp::setup
MooseApp::runInputFile
Action::InitProblemAction::act
FEProblem::init
FEProblem::EquationSystems::Init (1 call(s), 25.5% time, 17.6% memory)
PerfGraphSection "NonlinearSystemBase::computeJacobianTags":
Num calls: 2
Level: 5
Time (11.29%): Self 1.19 s, Children 0.00 s, Total 1.19 s
Memory (4.50%): Self 22 MB, Children 0 MB, Total 22 MB
Nodes:
- MooseTestApp (main)
MooseApp::run
MooseApp::execute
MooseApp::executeExecutioner
Steady::PicardSolve
FEProblem::solve
FEProblem::computeJacobianInternal
FEProblem::computeJacobianTags
NonlinearSystemBase::computeJacobianTags (2 call(s), 11.3% time, 4.5% memory)
PerfGraphSection "FEProblem::solve":
Num calls: 1
Level: 1
Time (60.96%): Self 0.85 s, Children 5.55 s, Total 6.40 s
Memory (53.58%): Self 239 MB, Children 23 MB, Total 262 MB
Nodes:
- MooseTestApp (main)
MooseApp::run
MooseApp::execute
MooseApp::executeExecutioner
Steady::PicardSolve
FEProblem::solve (1 call(s), 61.0% time, 53.6% memory)
PerfGraphSection "MeshGeneratorMesh::cacheInfo":
Num calls: 2
Level: 3
Time (3.69%): Self 0.39 s, Children 0.00 s, Total 0.39 s
Memory (5.11%): Self 25 MB, Children 0 MB, Total 25 MB
Nodes:
- MooseTestApp (main)
MooseApp::run
MooseApp::setup
MooseApp::runInputFile
Action::InitProblemAction::act
FEProblem::init
MeshGeneratorMesh::meshChanged
MeshGeneratorMesh::update
MeshGeneratorMesh::cacheInfo (1 call(s), 2.1% time, 0.2% memory)
- MooseTestApp (main)
MooseApp::run
MooseApp::setup
MooseApp::runInputFile
Action::SetupMeshCompleteAction::Mesh::act
Action::SetupMeshCompleteAction::Mesh::completeSetupUndisplaced
MeshGeneratorMesh::prepare
MeshGeneratorMesh::update
MeshGeneratorMesh::cacheInfo (1 call(s), 1.6% time, 4.9% memory)
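The ordering above appears to follow self time rather than total time: FEProblem::solve has the largest total (6.40 s) yet ranks fourth on its 0.85 s of self time. The sketch below reproduces that ordering from the printed values. It assumes, without confirmation from the source, that heaviestSections() sorts by self time (and by self memory when memory=True); the heaviest function is a hypothetical helper, not part of mooseutils.

```python
# (name, self time [s], self memory [MB]) copied from the output above
printed = [
    ('FEProblem::solve', 0.85, 239),
    ('MeshGeneratorMesh::cacheInfo', 0.39, 25),
    ('NonlinearSystemBase::Kernels', 4.34, 1),
    ('NonlinearSystemBase::computeJacobianTags', 1.19, 22),
    ('FEProblem::EquationSystems::Init', 2.68, 86),
]


def heaviest(rows, num, memory=False):
    """Hypothetical helper: top `num` names by self time, or self memory."""
    key_index = 2 if memory else 1
    ordered = sorted(rows, key=lambda row: row[key_index], reverse=True)
    return [row[0] for row in ordered[:num]]


by_time = heaviest(printed, 5)
by_memory = heaviest(printed, 3, memory=True)
print(by_time)
print(by_memory)
```

Note that the memory ranking here only covers the five sections printed above; a real call such as pgrr.heaviestSections(3, memory=True) considers every section in the graph and may return different names.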
Section by name
Suppose we want to look at the timing of all kernel evaluations. The section of interest is then NonlinearSystemBase::Kernels.
Obtain the section in question and show its information with:
kernels_section = pgrr.section('NonlinearSystemBase::Kernels')
print(kernels_section.info())
PerfGraphSection "NonlinearSystemBase::Kernels":
Num calls: 15
Level: 3
Time (41.36%): Self 4.34 s, Children 0.00 s, Total 4.34 s
Memory (0.20%): Self 1 MB, Children 0 MB, Total 1 MB
Nodes:
- MooseTestApp (main)
MooseApp::run
MooseApp::execute
MooseApp::executeExecutioner
Steady::PicardSolve
FEProblem::solve
FEProblem::computeResidualSys
FEProblem::computeResidualInternal
FEProblem::computeResidualTags
NonlinearSystemBase::nl::computeResidualTags
NonlinearSystemBase::computeResidualInternal
NonlinearSystemBase::Kernels (14 call(s), 38.3% time, 0.0% memory)
- MooseTestApp (main)
MooseApp::run
MooseApp::execute
MooseApp::executeExecutioner
Steady::PicardSolve
FEProblem::solve
NonlinearSystemBase::nlInitialResidual
FEProblem::computeResidualSys
FEProblem::computeResidualInternal
FEProblem::computeResidualTags
NonlinearSystemBase::nl::computeResidualTags
NonlinearSystemBase::computeResidualInternal
NonlinearSystemBase::Kernels (1 call(s), 3.0% time, 0.2% memory)
From this, we can see that there were 14 residual evaluations during the solve that took 38.3% of the total run time, and one evaluation of the initial residual that took 3.0% of the total run time.
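The per-node figures can be cross-checked against the section totals with a little arithmetic. The snippet below uses only numbers printed in the output above; the variable names are illustrative.

```python
# Values copied from the NonlinearSystemBase::Kernels output above
section_self_time = 4.34          # seconds, summed over both nodes
section_num_calls = 15
section_percent_time = 41.36      # percent of the total run time
node_percent_times = [38.3, 3.0]  # solve node, initial-residual node
node_num_calls = [14, 1]

# The two nodes account for every call recorded by the section
assert sum(node_num_calls) == section_num_calls

# The node percentages add up to the section percentage, within rounding
percent_sum = sum(node_percent_times)
assert abs(percent_sum - section_percent_time) < 0.1

# Average cost of one pass through the kernels
average_call_time = section_self_time / section_num_calls
print('Average time per call: {:.3f} s'.format(average_call_time))
```

With a loaded reader, the same quantities would come from kernels_section.selfTime(), kernels_section.numCalls(), and the percentTime() of each entry in kernels_section.nodes().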
Reference
mooseutils.PerfGraphReporterReader
mooseutils.PerfGraphReporterReader(file, part=0)
A reader for MOOSE PerfGraphReporter data.
Inputs:
file[str]: JSON file containing PerfGraphReporter data.
part[int]: Part of the JSON file to obtain when using "file".
The final timestep is used to capture the PerfGraph data.
heaviestNodes(num, memory=False)
Returns the heaviest nodes in the form of PerfGraphNode objects.
Inputs:
num[int]: The number of nodes to return.
memory[boolean]: Whether or not to sort by memory.
heaviestSections(num, memory=False)
Returns the heaviest sections in the form of PerfGraphSection objects.
Inputs:
num[int]: The number of sections to return.
memory[boolean]: Whether or not to sort by memory.
node(path)
Returns the node with the given path if one exists, otherwise None.
Inputs:
path[list]: Path to the node
recurse(act, *args, **kwargs)
Recursively do an action through the graph starting with the root node.
Inputs:
act[function]: Action to perform on each node (input: a PerfGraphNode)
rootNode()
Returns the root PerfGraphNode.
section(name)
Returns the PerfGraphSection with the given name if one exists, otherwise None.
Inputs:
name[str]: The name of the section.
sections()
Returns all of the named sections in a list of PerfGraphSection objects.
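As an illustration of walking the rebuilt graph, the sketch below performs a depth-first traversal using only the PerfGraphNode accessors children(), name(), and selfTime(). StandInNode and walk are hypothetical names introduced here so the snippet runs on its own, and the timing values are placeholders rather than measurements. recurse() presumably offers a comparable traversal, but its exact calling convention is not shown on this page, so it is not used here.

```python
class StandInNode:
    """Hypothetical stand-in exposing a few PerfGraphNode accessors."""

    def __init__(self, name, self_time, children=()):
        self._name = name
        self._self_time = self_time
        self._children = list(children)

    def name(self):
        return self._name

    def selfTime(self):
        return self._self_time

    def children(self):
        return self._children


def walk(node, depth=0):
    """Yield (depth, node) pairs depth-first, starting at `node`."""
    yield depth, node
    for child in node.children():
        yield from walk(child, depth + 1)


# Placeholder tree; with a real reader, start from pgrr.rootNode()
root = StandInNode('MooseTestApp (main)', 0.0, [
    StandInNode('MooseApp::run', 0.1, [
        StandInNode('MooseApp::setup', 0.4),
        StandInNode('MooseApp::execute', 0.2, [
            StandInNode('MooseApp::executeExecutioner', 0.6),
        ]),
    ]),
])

# Names of nodes whose self time exceeds a threshold (in seconds)
slow = [node.name() for _, node in walk(root) if node.selfTime() > 0.3]

# Print the tree with two spaces of indentation per level
for depth, node in walk(root):
    print('  ' * depth + node.name())
```

A generator keeps the traversal separate from whatever is done with each node, so the same walk can feed a filter, a printout, or a sum.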
mooseutils.PerfGraphNode
mooseutils.PerfGraphNode(name, node_data, parent)
A node in the graph for the PerfGraphReporterReader. These should really only be constructed internally within the PerfGraphReporterReader.
Inputs:
name[str]: Section name for this node.
node_data[dict]: JSON output for this node.
parent[PerfGraphNode]: The parent to this node (None if root).
_sumAllNodes(do)
Internal method for summing across all nodes.
child(name)
Returns the child node with the given name, if one exists, otherwise None.
children()
Returns the nodes that are immediate children to this node.
childrenMemory()
Returns the memory added by children in Megabytes.
childrenTime()
Returns the time the children took in seconds.
info()
Returns the number of calls, the time, memory, and children in a human readable form.
level()
Returns the level assigned to the section.
name()
Returns the name assigned to the section.
numCalls()
Returns the number of times this was called.
parent()
Returns the node that is an immediate parent to this node (None if root).
path()
Returns the full path in the graph for this node.
percentMemory()
Returns the percentage of memory this took relative to the total memory of the root node.
percentTime()
Returns the percentage of time this took relative to the total time of the root node.
rootNode()
Returns the root (top node in the graph).
section()
Returns the PerfGraphSection that this node is in.
selfMemory()
Returns the memory added by only this (not including children) in Megabytes.
selfTime()
Returns the time only this (not including children) took in seconds.
totalMemory()
Returns the memory added by this plus its children in Megabytes.
totalTime()
Returns the time this plus its children took in seconds.
mooseutils.PerfGraphSection
mooseutils.PerfGraphSection(name, level)
A section in the graph for the PerfGraphReporterReader. These should really only be constructed internally within the PerfGraphReporterReader.
Inputs:
name[str]: The name of the section.
level[int]: The level assigned to the section.
_sumAllNodes(do)
Internal method for summing across all nodes.
childrenMemory()
Returns the memory added by children in Megabytes.
childrenTime()
Returns the time the children took in seconds.
info()
Returns the number of calls, the time, and the memory in a human readable form.
level()
Returns the level assigned to the section.
name()
Returns the name assigned to the section.
node(path)
Returns the node with the given path, if one exists, otherwise None.
Inputs:
path[list]: Path in the graph to the node
nodes()
Returns the nodes that are in this section.
numCalls()
Returns the number of times this was called.
percentMemory()
Returns the percentage of memory this took relative to the total memory of the root node.
percentTime()
Returns the percentage of time this took relative to the total time of the root node.
rootNode()
Returns the root (top node in the graph).
selfMemory()
Returns the memory added by only this (not including children) in Megabytes.
selfTime()
Returns the time only this (not including children) took in seconds.
totalMemory()
Returns the memory added by this plus its children in Megabytes.
totalTime()
Returns the time this plus its children took in seconds.
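The relationships between the timing accessors can be checked with the FEProblem::solve figures from the example output earlier on this page. The snippet below is a worked sketch; it assumes, based on the percentTime() description, that the percentage is the total time divided by the total time of the root node.

```python
# Self, children, and total times in seconds, plus the time percentage,
# copied from the FEProblem::solve section output on this page
self_time, children_time, total_time = 0.85, 5.55, 6.40
percent_time = 60.96

# totalTime() should equal selfTime() + childrenTime(), within rounding
assert abs((self_time + children_time) - total_time) < 0.01

# Assumption: percentTime() = 100 * totalTime() / (root node total time),
# so the root total can be estimated by inverting that relation
estimated_root_total = 100.0 * total_time / percent_time
print('Estimated run time: {:.1f} s'.format(estimated_root_total))
```

The other sections printed in the example give nearly the same estimate (for instance, 4.34 s at 41.36%), which is consistent with that assumption.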