Binary Access with INL-HPC
The following information is provided for those with "HPC Binary Access" for an application and aims to get you up and running.
These instructions assume some comfort with running commands on a terminal. To keep the instructions on this page clear, the code blocks follow a convention: blocks with a dark blue background are commands executed on your local machine, and blocks with a black background are commands executed on a remote INL machine.
# commands with a dark blue background are executed on your local machine
# commands with a black background are executed on a remote machine
This page provides various links to the internal INL-HPC website: https://hpcweb.hpc.inl.gov. To access this site you will need a browser set up with the correct settings; see SOCKS Proxy.
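If you have not set this up yet, the proxy is typically an SSH dynamic port forward started from your local machine. The sketch below is only an illustration (the port number 5642 is a placeholder); use the exact settings given on the SOCKS Proxy page.
# run on your local machine; forwards a local SOCKS port (placeholder 5642) through hpclogin
ssh -D 5642 <your hpc user id>@hpclogin.inl.gov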
Login to INL-HPC
The first step is to configure your local machine to connect to INL-HPC (see SSH Config). This step only needs to be completed once for each machine you will use to access INL-HPC resources.
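As a rough sketch only (the authoritative entries are on the SSH Config page), the configuration usually amounts to adding a host entry to ~/.ssh/config on your local machine, for example:
# hypothetical ~/.ssh/config entry on your local machine; adjust per the SSH Config page
Host hpclogin hpclogin.inl.gov
    HostName hpclogin.inl.gov
    User <your hpc user id>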
Once configured, log in to INL-HPC.
ssh <your hpc user id>@hpclogin.inl.gov
Connect to an INL-HPC Machine
When you log in to "hpclogin" you will then need to connect to an HPC machine such as Sawtooth. For a complete list of machines, please refer to the internal INL HPC website: hpcweb.hpc.inl.gov.
Assuming you want to work on Sawtooth, you can access it by logging into one of the two available login nodes: "sawtooth1" or "sawtooth2".
ssh sawtooth2
This will log you into a login node of an HPC machine. You can verify that you connected to the machine by displaying the host name:
$ echo $HOSTNAME
sawtooth2
The login nodes, as the name suggests, are simply for accessing the machine and for scheduling jobs or requesting an interactive job. Please do not run simulations on the login nodes; your account could be suspended if this occurs too often.
Load Application Environment
Next, it is necessary to load the desired application environment. For this example it is assumed that BISON is the desired application. Others include "GRIFFIN" and "BLUE_CRAB". The correct environment is loaded using the module command as follows.
module load use.moose moose-apps BISON
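If the command returns without errors, you can confirm what is active with the standard module list command (the exact module names and versions shown will vary by system):
# list the modules currently loaded in your environment
module list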
At this point you should be able to run the executable. For this example, the BISON executable help can be displayed by running the following command.
/apps/herd/bison/bison-opt --help
Create a Test Input File
To demonstrate the use of the binary and to verify that it is operating correctly, create a test input file. For example, the following input file models steady-state diffusion and works with any MOOSE-based application.
[Mesh]
  type = GeneratedMesh
  dim = 2
  nx = 10
  ny = 10
[]

[Variables]
  [u]
  []
[]

[Kernels]
  [diff]
    type = Diffusion
    variable = u
  []
[]

[BCs]
  [left]
    type = DirichletBC
    variable = u
    boundary = left
    value = 0
  []
  [right]
    type = DirichletBC
    variable = u
    boundary = right
    value = 1
  []
[]

[Executioner]
  type = Steady
  solve_type = 'PJFNK'
  petsc_options_iname = '-pc_type -pc_hypre_type'
  petsc_options_value = 'hypre boomeramg'
[]

[Outputs]
  exodus = true
[]
(test/tests/kernels/simple_diffusion/simple_diffusion.i)
This file can be created in your INL-HPC home directory or copied via scp from your local machine. In both cases, first create a location for the file. For example, the following creates a "testing" folder in your remote home directory.
mkdir ~/testing
If you want to create the file on the remote machine, you will need to use a terminal editor such as "emacs" or "vi". Alternatively, you can create the file on your local machine and copy it to the remote machine by running the following from your local machine, assuming you created a file "input.i" with the aforementioned content.
scp /path/to/the/local/input.i <your hpc user id>@hpclogin.inl.gov:~/testing/input.i
Running Test Simulation
(1) Start-up an interactive job
As mentioned above, simulations should not be executed on the login nodes. For this example, start an interactive job by requesting a single processor for one hour within the "moose" project queue.
qsub -I -l select=1:ncpus=1 -l walltime=01:00:00 -P moose
This command will likely take a few seconds or even minutes to execute, depending on the activity on the machine. When it completes, you should see a message similar to the following.
qsub: waiting for job 141541.sawtoothpbs to start
qsub: job 141541.sawtoothpbs ready
Again, you can verify that you are on a compute node using the echo $HOSTNAME command, which should now return a compute node name rather than a login node name; on Sawtooth, for example, node names look like "r1i2n34".
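A quick check, using the example node name from above (your node will differ):
$ echo $HOSTNAME
r1i2n34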
When running in interactive mode you will also need to load the environment again.
module load use.moose moose-apps BISON
(2) Run the simulation
In the previous section a test input file was created in your INL-HPC home directory within the "testing" folder. MOOSE-based applications are designed to be executed from the location of the input file, so start by navigating to this location.
cd ~/testing
Next, execute the application with the "-i" argument followed by the input filename. For example, when using BISON with the test input file created above:
/apps/herd/bison/bison-opt -i input.i
The available applications such as BISON will depend on being granted access to the specific application. Please refer to INL Applications and Remote Access for information about requesting access and the different levels available.
(3) Viewing Results
There are many ways to view results, including options on the remote machine; please refer to the visualization systems linked on hpcweb.hpc.inl.gov/hardware.
However, in many cases it is desirable to view the results on your local machine. This is done by copying the content from the remote machine to your local machine using scp. For example, to copy the output of this test run, use the following command on your local machine.
scp <your hpc user id>@hpclogin.inl.gov:~/testing/input_out.e /path/to/the/local/destination
Scheduling Jobs with PBS
This section provides a basic example of using the Portable Batch System (PBS) for scheduling jobs to run on INL resources. For detailed information regarding PBS, please visit the INL-HPC PBS website: https://hpcweb.hpc.inl.gov/home/pbs.
In general, scheduling jobs requires two steps:
create a PBS script that describes the required resources and the commands to execute and
submit the script to the scheduler.
(1) Create PBS Script
Using the example problem from above, a simple script for scheduling may be created. The top portion of the script provides PBS directives that are passed to the scheduler. Notice that these lines are simply the arguments being passed to the scheduler, as was done via the command line in the interactive job above.
#!/bin/bash
#PBS -N test_run
#PBS -l select=1:ncpus=48:mpiprocs=48
#PBS -l walltime=5:00
#PBS -P moose
cd $PBS_O_WORKDIR
source /etc/profile.d/modules.sh
module load use.moose moose-apps PETSc BISON
mpirun /apps/herd/bison/bison-opt -i input.i Mesh/uniform_refine=2
It is recommended that this script be created in the same location as the input file(s) it will execute. This will ensure that the associated output from the simulation ends up in the same location. For this example the script is located in the testing folder: ~/testing/test.sh.
The PBS directives for this script include:
-N test_run sets the job name;
-l select=1:ncpus=48:mpiprocs=48 requests one compute node (select=1), using all 48 cores and running 48 MPI processes;
-l walltime=5:00 allocates 5 minutes of time for the simulation; and
-P moose dictates that the job should be part of the MOOSE queue. For a complete list of available queues, see the HPC PBS page.
Sawtooth has 48 processors per compute node; the other machines have different counts. Please refer to hpcweb.hpc.inl.gov for a list of the available systems.
The second portion provides the commands to be executed. The cd $PBS_O_WORKDIR command changes to the directory that was used when the job was submitted (the environment variable is defined with the letter "O"). The commands that follow enable the "module" command and then load the required modules for the problem. Finally, the "mpirun" command is followed by the application executable and the input file to execute. This example file is very simple, so the mesh is refined extensively to make the problem better suited for 48 cores.
(2) Submit Script
Submitting the job to the scheduler is trivial. First, be sure to log in to one of the desired machine's login nodes (e.g., ssh sawtooth2). Change directory to the location of the script and associated input files and then use the "qsub" command.
cd ~/testing
qsub test.sh
Running this command will return the job id (e.g., 141861.sawtoothpbs). The qstat command can be used to check the status of the jobs.
$ qstat -u <your hpc user id>
sawtoothpbs:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
141861.sawtooth slauae short test_run -- 2 192 725gb 00:05 Q --
For more information about monitoring job status, including the qstat command, please refer to the INL Job Status page. There are also comprehensive status websites for each machine: hpcweb.hpc.inl.gov/status.
When the job completes, the expected output files will be available in the same location as the input file (~/testing). Refer to (3) Viewing Results in the previous example for more information regarding visualization.
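As a quick sanity check before copying anything (the file names here assume this example), you can list the testing directory on the remote machine to confirm the Exodus output was produced:
ls ~/testing
# expect input_out.e alongside input.i and test.sh, plus the PBS job output files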