Apptainer

Inevitably, a PR will fail in CIVET while appearing to pass on your own machine. Apptainer can help you reproduce the error interactively, using the same container CIVET used when the error occurred.

Launch Container

First, we need the uniform resource identifier (URI) of the container in which the job is failing. The URI is listed in the first few lines of output from any CIVET step that failed (red box). As an example:


//: Running in versioned apptainer container moose-dev
BUILD_ROOT/moose/: Executing moosebuild in environment oras://mooseharbor.hpc.inl.gov/moose-dev/moose-dev-x86_64:4b79189
INFO:    Using cached SIF image

The above contains the URI we want:


oras://mooseharbor.hpc.inl.gov/moose-dev/moose-dev-x86_64:4b79189
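
If the step output has been saved or pasted into a terminal, the URI can also be pulled out with a pattern match rather than by eye. This is a minimal sketch, assuming the log lines look like the sample above; the variable names `log` and `uri` are illustrative only.

```shell
# Sample CIVET output, copied from the failing step shown above.
log='//: Running in versioned apptainer container moose-dev
BUILD_ROOT/moose/: Executing moosebuild in environment oras://mooseharbor.hpc.inl.gov/moose-dev/moose-dev-x86_64:4b79189
INFO:    Using cached SIF image'

# Keep the first token that starts with oras:// and runs to the next whitespace.
uri=$(printf '%s\n' "$log" | grep -o 'oras://[^[:space:]]*' | head -n 1)
echo "$uri"
```

The resulting `$uri` can then be passed to apptainer in place of the literal address.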

With the URI known, use HPC OnDemand to launch an interactive shell on a login node of any INL HPC cluster, and perform the following:


[~]> module load apptainer
[~]> apptainer shell oras://mooseharbor.hpc.inl.gov/moose-dev/moose-dev-x86_64:4b79189

[sha256:52660332a1977bb9cbb3e4aaf9b19f810669eba8579b8dda6193eba7b05cf359][~]>

(See the Apptainer Features section below for how to shorten the lengthy prompt.)

Troubleshoot

You are now operating from the same container in which your PR is failing. By default, your home directory should be available and, by extension, so should the ~/cluster_name/projects directory containing your project.

Tip

If you are operating from your /scratch directory, see Apptainer Features below. Making /scratch available inside the container requires an additional argument to apptainer.

Next, simply put: clone your application (if you have not already), build it, run it, and test it. Do whatever it is that CIVET failed to do.

Troubleshooting Hints

If you are having difficulty reproducing the failure:

  • Be absolutely certain your project repo is clean. It may be best to create a new clone of your project, identical to the one CIVET creates during the Fetch and Branch step.

  • Revisit the failing step in CIVET and scrutinize every command, argument, and environment variable it used. Also check the steps that precede the failing step to determine whether their results are cause for concern (for example, cloning succeeded, but perhaps a submodule update did not).

  • Perhaps CIVET was instructed to treat warnings as errors (make -Werror). Or perhaps CIVET is building and using a different method (METHOD=dbg make), etc.

  • Check for files missing from your PR (a forgotten git add).
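
The first hint can be checked mechanically. Below is a minimal sketch, assuming git is available on the host; `repo_is_clean` is a hypothetical helper name, not part of MOOSE or CIVET, and the demonstration uses a throwaway repository rather than your project.

```shell
# Hypothetical helper: succeeds only if the clone at $1 has no modified,
# untracked, or ignored files (stale build artifacts are usually ignored files).
repo_is_clean() {
    [ -z "$(git -C "$1" status --porcelain --ignored)" ]
}

# Demonstrate on a throwaway repository so nothing of yours is touched.
demo=$(mktemp -d)
git init -q "$demo"
repo_is_clean "$demo" && echo "clean"

# A leftover object file makes the clone dirty.
touch "$demo/leftover.o"
repo_is_clean "$demo" || echo "not clean"
```

Pointing the helper at your own clone (for example `repo_is_clean ~/cluster_name/projects/your_app`) reports whether anything differs from a fresh checkout. It does not check submodules; `git submodule status` covers that separately.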

If you're still unable to reproduce the error, please keep in mind one golden truth: containers are immutable. The lack of reproducibility is caused by something in your environment that differs from CIVET's, such as network connectivity, shell flavor (tcsh, zsh, bash), or CPU microarchitecture (exceedingly rare).

Note: Many CIVET steps do not have network access

Beyond the initial cloning of the application, most steps are barred from network access. You can mimic this behavior if need be. See Apptainer Features below for details.

Apptainer Features

The following are only a few of the many available features Apptainer has to offer.

Paths and Mounts (e.g. your /scratch directory)

You can instruct Apptainer to mount any path you have access to, so that it is available while inside the container. On INL HPC machines, this is most useful if you enjoy operating in your /scratch directory. You can instruct Apptainer to mount this location by way of the -B argument:


apptainer shell -B /scratch oras://...

[moose-dev][~]> ls /scratch  # contents of /scratch are displayed

Do you want to mount something somewhere else - perhaps also with read-only permissions?


apptainer shell -B /scratch:/somewhere_else:ro oras://...

[moose-dev][~]> ls /somewhere_else  # contents of /scratch, now mounted at /somewhere_else, are displayed
[moose-dev][~]> touch /somewhere_else/<your user id>/testing
touch: cannot touch '/somewhere_else/<your user id>/testing': Read-only file system

Multiple paths, separated by commas:


apptainer shell -B /scratch:/somewhere_else:ro,/projects oras://...

Inherited Environment Variables

To make an environment variable available inside the container when it passes control over to you, prefix the variable name with APPTAINERENV_ when exporting it on the host:


[~]> export APPTAINERENV_foo=bar
[~]> apptainer shell oras://...

[moose-dev][~]> echo $foo
bar
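
The same prefix works for build settings you want to carry into the container, such as the METHOD=dbg case mentioned under Troubleshooting Hints. This is a minimal sketch. The choice of METHOD is illustrative, and the listing runs on the host only, as a check before launching the container.

```shell
# Forward METHOD=dbg into the next container session.
export APPTAINERENV_METHOD=dbg

# List every variable that will be forwarded, with the prefix stripped,
# to confirm what the container will see.
env | sed -n 's/^APPTAINERENV_//p'
```

Running `unset APPTAINERENV_METHOD` on the host stops the variable from being forwarded to later sessions.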

Beyond forwarding your own variables, you can also use APPTAINERENV_ to control how the prompt is displayed:


[~]> export APPTAINERENV_APPTAINER_NAME=moose-dev
[~]> apptainer shell oras://mooseharbor.hpc.inl.gov/moose-dev/moose-dev-x86_64:4b79189

[moose-dev][~]>

Tip: Custom Prompt

There is an all-encompassing $CUSTOM_PROMPT variable that allows you to pass your own prompt:


[~]> export APPTAINERENV_CUSTOM_PROMPT="\[\033[1;34m\][my-container]\[\033[1;32m\][\t]\[\033[0m\]> "
[~]> apptainer shell oras://mooseharbor.hpc.inl.gov/moose-dev/moose-dev-x86_64:4b79189

[my-container][12:17:43]>

(Color codes cannot be reproduced here in the documentation, but they are honored.)

Helpful Arguments

The following highlights some of the more influential arguments CIVET employs when testing your PR.

[exec,shell] means the argument applies to either the exec or the shell sub-command.

Argument                          Description
exec oras://... echo hello world  Executes echo hello world inside the container, then exits
[exec,shell] --containall         Minimalistic container (empty /dev, /tmp, $HOME)
[exec,shell] --network none       No network
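
The arguments in the table can be combined in a single invocation. The sketch below composes one and prints it, and it runs the command only if apptainer is found on the host. It reuses the example URI from earlier on this page, and `cmd_line` is an illustrative variable name. One caveat, stated as an assumption rather than a fact about CIVET's setup: some Apptainer versions apply --network only when --net is also given, so consult `apptainer help exec` if network access is still available inside the container.

```shell
uri='oras://mooseharbor.hpc.inl.gov/moose-dev/moose-dev-x86_64:4b79189'

# Compose the command from the table's arguments; "set --" keeps each argument intact.
set -- apptainer exec --containall --network none "$uri" echo hello world

# Show what would be run.
cmd_line="$*"
echo "$cmd_line"

# Run it only where Apptainer is installed (e.g. after "module load apptainer").
if command -v apptainer >/dev/null 2>&1; then
    "$@"
fi
```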