PerfGraphReporterReader

A Python utility that provides an interface for reading PerfGraphReporter output. It rebuilds the graph so that it can be traversed via PerfGraphNode and PerfGraphSection objects.

Example usage

Take the following simple diffusion problem, which has a PerfGraphReporter set to output on final:

[Mesh]
  [gmg]
    type = GeneratedMeshGenerator
    dim = 2
    nx = 1
    ny = 1
  []
[]

[Variables/u]
[]

[Kernels/diff]
  type = Diffusion
  variable = u
[]

[BCs]
  [left]
    type = DirichletBC
    variable = u
    boundary = left
    value = 0
  []
  [right]
    type = DirichletBC
    variable = u
    boundary = right
    value = 1
  []
[]

[Executioner]
  type = Steady
  solve_type = 'PJFNK'
  petsc_options_iname = '-pc_type -pc_hypre_type'
  petsc_options_value = 'hypre boomeramg'
[]

[Reporters/perf_graph]
  type = PerfGraphReporter
  execute_on = FINAL
[]

[Outputs/json]
  type = JSON
  execute_on = 'INITIAL FINAL'
[]
(test/tests/reporters/perf_graph_reporter/perf_graph_reporter.i)

To obtain more realistic timing, we pass the command line arguments "Mesh/gmg/nx=500 Mesh/gmg/ny=500", because the test above is executed with only a single element. We then execute the following (where MOOSE_DIR is an environment variable set to the directory that contains MOOSE):

$MOOSE_DIR/test/moose_test-opt -i $MOOSE_DIR/test/tests/reporters/perf_graph_reporter/perf_graph_reporter.i Mesh/gmg/nx=500 Mesh/gmg/ny=500

This run will generate the desired output in $MOOSE_DIR/test/tests/reporters/perf_graph_reporter/perf_graph_reporter_json.json.

Load the generated output into a PerfGraphReporterReader as follows:

import os
from mooseutils.PerfGraphReporterReader import PerfGraphReporterReader

# MOOSE_DIR must be set in the environment; a KeyError is raised otherwise
MOOSE_DIR = os.environ['MOOSE_DIR']
json_file = os.path.join(MOOSE_DIR, 'test/tests/reporters/perf_graph_reporter/perf_graph_reporter_json.json')
pgrr = PerfGraphReporterReader(json_file)
Note

The examples that follow show detail by calling the info() method on each PerfGraphSection and PerfGraphNode. This is typically not the best way to consume the data (you should generally use the member methods of these classes to access the values you need), but it is appropriate here for demonstrating usage. For the member methods available on each class, see Reference.
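As a minimal sketch of the alternative to info(), the helper below builds a one-line summary from accessor methods listed in the Reference. The function name summarize is hypothetical and not part of mooseutils. It only assumes that the object offers name(), numCalls(), selfTime(), totalTime(), and percentTime(), so it should apply to both a PerfGraphSection and a PerfGraphNode.

```python
def summarize(obj):
    """Build a one-line summary from accessor methods instead of info().

    `obj` is expected to be a PerfGraphSection or a PerfGraphNode; only
    accessors documented in the Reference are used.
    """
    template = '{}: {} call(s), self {:.2f} s, total {:.2f} s ({:.1f}% of total run time)'
    return template.format(obj.name(), obj.numCalls(), obj.selfTime(),
                           obj.totalTime(), obj.percentTime())
```

With the reader loaded above, print(summarize(pgrr.section('NonlinearSystemBase::Kernels'))) would produce a line of this form.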

Heaviest sections

We can determine the five heaviest sections with:

for section in pgrr.heaviestSections(5):
    print(section)
PerfGraphSection "NonlinearSystemBase::Kernels"
PerfGraphSection "FEProblem::EquationSystems::Init"
PerfGraphSection "NonlinearSystemBase::computeJacobianTags"
PerfGraphSection "FEProblem::solve"
PerfGraphSection "MeshGeneratorMesh::cacheInfo"

Similarly, we can obtain more detailed output with:

for section in pgrr.heaviestSections(5):
    print(section.info())
PerfGraphSection "NonlinearSystemBase::Kernels":
  Num calls: 15
  Level: 3
  Time (41.36%): Self 4.34 s, Children 0.00 s, Total 4.34 s
  Memory (0.20%): Self 1 MB, Children 0 MB, Total 1 MB
  Nodes:
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::execute
         MooseApp::executeExecutioner
          Steady::PicardSolve
           FEProblem::solve
            FEProblem::computeResidualSys
             FEProblem::computeResidualInternal
              FEProblem::computeResidualTags
               NonlinearSystemBase::nl::computeResidualTags
                NonlinearSystemBase::computeResidualInternal
                 NonlinearSystemBase::Kernels (14 call(s), 38.3% time, 0.0% memory)
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::execute
         MooseApp::executeExecutioner
          Steady::PicardSolve
           FEProblem::solve
            NonlinearSystemBase::nlInitialResidual
             FEProblem::computeResidualSys
              FEProblem::computeResidualInternal
               FEProblem::computeResidualTags
                NonlinearSystemBase::nl::computeResidualTags
                 NonlinearSystemBase::computeResidualInternal
                  NonlinearSystemBase::Kernels (1 call(s), 3.0% time, 0.2% memory)
PerfGraphSection "FEProblem::EquationSystems::Init":
  Num calls: 1
  Level: 2
  Time (25.50%): Self 2.68 s, Children 0.00 s, Total 2.68 s
  Memory (17.59%): Self 86 MB, Children 0 MB, Total 86 MB
  Nodes:
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::setup
         MooseApp::runInputFile
          Action::InitProblemAction::act
           FEProblem::init
            FEProblem::EquationSystems::Init (1 call(s), 25.5% time, 17.6% memory)
PerfGraphSection "NonlinearSystemBase::computeJacobianTags":
  Num calls: 2
  Level: 5
  Time (11.29%): Self 1.19 s, Children 0.00 s, Total 1.19 s
  Memory (4.50%): Self 22 MB, Children 0 MB, Total 22 MB
  Nodes:
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::execute
         MooseApp::executeExecutioner
          Steady::PicardSolve
           FEProblem::solve
            FEProblem::computeJacobianInternal
             FEProblem::computeJacobianTags
              NonlinearSystemBase::computeJacobianTags (2 call(s), 11.3% time, 4.5% memory)
PerfGraphSection "FEProblem::solve":
  Num calls: 1
  Level: 1
  Time (60.96%): Self 0.85 s, Children 5.55 s, Total 6.40 s
  Memory (53.58%): Self 239 MB, Children 23 MB, Total 262 MB
  Nodes:
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::execute
         MooseApp::executeExecutioner
          Steady::PicardSolve
           FEProblem::solve (1 call(s), 61.0% time, 53.6% memory)
PerfGraphSection "MeshGeneratorMesh::cacheInfo":
  Num calls: 2
  Level: 3
  Time (3.69%): Self 0.39 s, Children 0.00 s, Total 0.39 s
  Memory (5.11%): Self 25 MB, Children 0 MB, Total 25 MB
  Nodes:
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::setup
         MooseApp::runInputFile
          Action::InitProblemAction::act
           FEProblem::init
            MeshGeneratorMesh::meshChanged
             MeshGeneratorMesh::update
              MeshGeneratorMesh::cacheInfo (1 call(s), 2.1% time, 0.2% memory)
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::setup
         MooseApp::runInputFile
          Action::SetupMeshCompleteAction::Mesh::act
           Action::SetupMeshCompleteAction::Mesh::completeSetupUndisplaced
            MeshGeneratorMesh::prepare
             MeshGeneratorMesh::update
              MeshGeneratorMesh::cacheInfo (1 call(s), 1.6% time, 4.9% memory)

Section by name

Let's say we want to look at the timing of all kernel evaluations. In that case, we're interested in the section NonlinearSystemBase::Kernels.

Obtain the section in question and show its information with:

kernels_section = pgrr.section('NonlinearSystemBase::Kernels')
print(kernels_section.info())
PerfGraphSection "NonlinearSystemBase::Kernels":
  Num calls: 15
  Level: 3
  Time (41.36%): Self 4.34 s, Children 0.00 s, Total 4.34 s
  Memory (0.20%): Self 1 MB, Children 0 MB, Total 1 MB
  Nodes:
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::execute
         MooseApp::executeExecutioner
          Steady::PicardSolve
           FEProblem::solve
            FEProblem::computeResidualSys
             FEProblem::computeResidualInternal
              FEProblem::computeResidualTags
               NonlinearSystemBase::nl::computeResidualTags
                NonlinearSystemBase::computeResidualInternal
                 NonlinearSystemBase::Kernels (14 call(s), 38.3% time, 0.0% memory)
    - MooseTestApp (main)
       MooseApp::run
        MooseApp::execute
         MooseApp::executeExecutioner
          Steady::PicardSolve
           FEProblem::solve
            NonlinearSystemBase::nlInitialResidual
             FEProblem::computeResidualSys
              FEProblem::computeResidualInternal
               FEProblem::computeResidualTags
                NonlinearSystemBase::nl::computeResidualTags
                 NonlinearSystemBase::computeResidualInternal
                  NonlinearSystemBase::Kernels (1 call(s), 3.0% time, 0.2% memory)

From this, we can see that there were 14 residual evaluations during the solve that took 38.3% of the total run time, and one residual evaluation on initial that took 3.0% of the total run time.
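The same per-node breakdown can be gathered without parsing the info() text. The sketch below is one possible way to do so; the function name node_breakdown is hypothetical, and it assumes that PerfGraphNode.path() returns a list of section names from the root down to the node.

```python
def node_breakdown(section):
    """Return (path, calls, percent_time) for every node in a section.

    Assumes node.path() returns a list of section names from the root
    down to the node; the list is joined here for display only.
    """
    rows = []
    for node in section.nodes():
        rows.append((' > '.join(node.path()), node.numCalls(), node.percentTime()))
    # Largest contributors to the run time first
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Calling node_breakdown(kernels_section) on the section obtained above should yield one row per node, here one for the evaluations during the solve and one for the initial residual.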

Reference

mooseutils.PerfGraphReporterReader

mooseutils.PerfGraphReporterReader(file, part=0)

A reader for MOOSE PerfGraphReporter data.

Inputs:

  • file[str]: JSON file containing PerfGraphReporter data.

  • part[int]: Part of the JSON file to obtain when using "file".

The final timestep is used to capture the PerfGraph data.

heaviestNodes(num, memory=False)

Returns the heaviest nodes in the form of PerfGraphNode objects.

Inputs:

  • num[int]: The number of nodes to return.

  • memory[boolean]: Whether or not to sort by memory.
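A short sketch of how heaviestNodes might be used to build a compact table follows. The helper name heaviest_node_rows is hypothetical. Reporting the self time (or self memory) is a choice made here: the ordering in the example output above is consistent with sorting by self time, but the quantity used for sorting is not stated in this reference.

```python
def heaviest_node_rows(reader, num, memory=False):
    """Return (name, calls, value) rows for the heaviest nodes.

    `value` is the self memory in MB when memory=True, otherwise the
    self time in seconds. Which quantity heaviestNodes() sorts by is an
    assumption inferred from the example output.
    """
    rows = []
    for node in reader.heaviestNodes(num, memory=memory):
        value = node.selfMemory() if memory else node.selfTime()
        rows.append((node.name(), node.numCalls(), value))
    return rows
```

For example, heaviest_node_rows(pgrr, 5, memory=True) would list the five nodes that the reader considers heaviest in memory.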

heaviestSections(num, memory=False)

Returns the heaviest sections in the form of PerfGraphSection objects.

Inputs:

  • num[int]: The number of sections to return.

  • memory[boolean]: Whether or not to sort by memory.
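The memory flag makes it possible to compare the two rankings. As an illustration, the hypothetical helper below returns the names of sections that appear among the heaviest both by time and by memory; it uses only heaviestSections() and name().

```python
def heavy_in_both(reader, num):
    """Names of sections ranked in the top `num` by time and by memory.

    The result keeps the order of the time-based ranking.
    """
    by_time = [section.name() for section in reader.heaviestSections(num)]
    by_memory = {section.name() for section in reader.heaviestSections(num, memory=True)}
    return [name for name in by_time if name in by_memory]
```

Sections returned by heavy_in_both(pgrr, 5) are candidates that are expensive in both respects.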

node(path)

Returns the node with the given path if one exists, otherwise None.

Inputs:

  • path[list]: Path to the node
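Because node(path) returns None for a path that does not exist, a caller may prefer a lookup that fails loudly. The wrapper below is a sketch of that idea; require_node is a hypothetical name, and the path is assumed to be a list of section names such as those shown in the "Nodes" listings above (whether the root name must be included is not stated here).

```python
def require_node(reader, path):
    """Look up a node by path and raise KeyError if it does not exist.

    `path` is assumed to be a list of section names (strings).
    """
    node = reader.node(path)
    if node is None:
        raise KeyError('No node at path: ' + ' > '.join(path))
    return node
```

This avoids an AttributeError later on when a method is called on a missing node.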

recurse(act, *args, **kwargs)

Recursively do an action through the graph starting with the root node.

Inputs:

  • act[function]: Action to perform on each node (input: a PerfGraphNode)
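One way to use recurse is to collect nodes that satisfy a condition. The sketch below gathers every node whose share of the run time meets a threshold. The helper name nodes_over is hypothetical, and a closure is used to carry the threshold rather than relying on how the extra arguments are forwarded.

```python
def nodes_over(reader, min_percent):
    """Collect all nodes whose percentTime() is at least `min_percent`."""
    found = []

    def act(node):
        # Called once for each PerfGraphNode visited by recurse()
        if node.percentTime() >= min_percent:
            found.append(node)

    reader.recurse(act)
    return found
```

For example, nodes_over(pgrr, 10.0) would return the nodes that account for at least 10% of the run time, in the order in which recurse visits them.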

rootNode()

Returns the root PerfGraphNode.

section(name)

Returns the PerfGraphSection with the given name if one exists, otherwise None.

Inputs:

  • name[str]: The name of the section.

sections()

Returns all of the named sections in a list of PerfGraphSection objects.
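Since sections() returns every section, it lends itself to simple aggregation. As a sketch, the hypothetical helper below groups section names by their level using only name() and level().

```python
from collections import defaultdict


def sections_by_level(reader):
    """Map each level to the sorted names of the sections at that level."""
    grouped = defaultdict(list)
    for section in reader.sections():
        grouped[section.level()].append(section.name())
    # Return a plain dict ordered by level
    return {level: sorted(names) for level, names in sorted(grouped.items())}
```

The result of sections_by_level(pgrr) can help in deciding which level of detail is of interest before inspecting individual sections.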

mooseutils.PerfGraphNode

mooseutils.PerfGraphNode(name, node_data, parent)

A node in the graph for the PerfGraphReporterReader. These should really only be constructed internally within the PerfGraphReporterReader.

Inputs:

  • name[str]: Section name for this node.

  • node_data[dict]: JSON output for this node.

  • parent[PerfGraphNode]: The parent to this node (None if root).

_sumAllNodes(do)

Internal method for summing across all nodes.

child(name)

Returns the child node with the given name, if one exists, otherwise None.

children()

Returns the nodes that are immediate children to this node.
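The graph can also be walked by hand with children(). The sketch below renders a node and its descendants as indented lines down to a chosen depth; format_tree is a hypothetical helper, and it assumes that children() returns an iterable of PerfGraphNode objects.

```python
def format_tree(node, max_depth=2, depth=0):
    """Return indented lines for `node` and its descendants.

    Descendants deeper than `max_depth` levels below the starting node
    are omitted.
    """
    lines = ['{}{} ({:.1f}%)'.format('  ' * depth, node.name(), node.percentTime())]
    if depth < max_depth:
        for child in node.children():
            lines.extend(format_tree(child, max_depth, depth + 1))
    return lines
```

For example, print('\n'.join(format_tree(pgrr.rootNode()))) would print the top of the graph with the percentage of time for each node.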

childrenMemory()

Returns the memory added by children in Megabytes.

childrenTime()

Returns the time the children took in seconds.

info()

Returns the number of calls, the time, the memory, and the children in a human-readable form.

level()

Returns the level assigned to the section.

name()

Returns the name assigned to the section.

numCalls()

Returns the number of times this was called.

parent()

Returns the node that is an immediate parent to this node (None if root).
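Walking upward works in the same manner with parent(). The hypothetical helper below returns the chain of ancestors of a node, relying only on the documented behavior that parent() returns None at the root.

```python
def ancestors(node):
    """Return the ancestors of `node`, nearest parent first, root last."""
    chain = []
    current = node.parent()
    while current is not None:
        chain.append(current)
        current = current.parent()
    return chain
```

The length of the returned list gives the depth of the node in the graph, with zero for the root.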

path()

Returns the full path in the graph for this node.

percentMemory()

Returns the percentage of memory this used relative to the total memory of the root node.

percentTime()

Returns the percentage of time this took relative to the total time of the root node.
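The example output above suggests how this percentage relates to the other accessors: a total of 4.34 s reported as 41.36% and a total of 6.40 s reported as 60.96% both imply a root total of roughly 10.5 s. Under the assumption that the percentage is the total time of the node divided by the total time of the root, it could be reproduced as in the sketch below (percent_of_root_time is a hypothetical name).

```python
def percent_of_root_time(node):
    """Percentage of the root's total time taken by `node` in total.

    Assumes percentTime() is defined as totalTime() relative to the
    root's totalTime(), as inferred from the example output.
    """
    root_total = node.rootNode().totalTime()
    if root_total == 0:
        return 0.0
    return 100.0 * node.totalTime() / root_total
```

Comparing this value with node.percentTime() is a quick way to check that assumption against real data.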

rootNode()

Returns the root (top node in the graph).

section()

Returns the PerfGraphSection that this node is in.

selfMemory()

Returns the memory added by only this (not including children) in Megabytes.

selfTime()

Returns the time only this (not including children) took in seconds.

totalMemory()

Returns the memory added by this plus its children in Megabytes.

totalTime()

Returns the time this plus its children took in seconds.
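In the example output above, the self and children times add up to the total (for FEProblem::solve: 0.85 s + 5.55 s = 6.40 s). A small check of that relationship might look as follows; time_is_consistent and its default tolerance are hypothetical choices made here.

```python
def time_is_consistent(obj, tol=0.02):
    """Check that selfTime() + childrenTime() matches totalTime().

    `tol` is an absolute tolerance in seconds that allows for rounding.
    """
    return abs(obj.selfTime() + obj.childrenTime() - obj.totalTime()) <= tol
```

The same check can be written for memory with selfMemory(), childrenMemory(), and totalMemory().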

mooseutils.PerfGraphSection

mooseutils.PerfGraphSection(name, level)

A section in the graph for the PerfGraphReporterReader. These should really only be constructed internally within the PerfGraphReporterReader.

Inputs:

  • name[str]: The name of the section.

  • level[int]: The level assigned to the section.

_sumAllNodes(do)

Internal method for summing across all nodes.

childrenMemory()

Returns the memory added by children in Megabytes.

childrenTime()

Returns the time the children took in seconds.

info()

Returns the number of calls, the time, and the memory in a human-readable form.

level()

Returns the level assigned to the section.

name()

Returns the name assigned to the section.

node(path)

Returns the node with the given path, if one exists, otherwise None.

Inputs:

  • path[list]: Path in the graph to the node

nodes()

Returns the nodes that are in this section.

numCalls()

Returns the number of times this was called.
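In the example output above, the 15 calls of NonlinearSystemBase::Kernels are split across its two nodes as 14 + 1. Assuming the call count of a section is the sum over its nodes, as that output suggests, the split can be obtained as in the sketch below (calls_per_node is a hypothetical name).

```python
def calls_per_node(section):
    """Return the call count of each node in a section and their sum."""
    counts = [node.numCalls() for node in section.nodes()]
    return counts, sum(counts)
```

The returned sum can be compared with section.numCalls() to confirm the assumption for a given data set.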

percentMemory()

Returns the percentage of memory this used relative to the total memory of the root node.

percentTime()

Returns the percentage of time this took relative to the total time of the root node.

rootNode()

Returns the root (top node in the graph).

selfMemory()

Returns the memory added by only this (not including children) in Megabytes.

selfTime()

Returns the time only this (not including children) took in seconds.

totalMemory()

Returns the memory added by this plus its children in Megabytes.

totalTime()

Returns the time this plus its children took in seconds.