Troubleshooting

This page collects issues that most of us have run into at some point. Use the navigation list on the right to start with the section you are having trouble with. This document is designed so that you will 'jump' between sections, because fixing one issue often depends on first fixing another.

Mailing List

If your issue cannot be solved here, please submit your question to our mailing list for help!

Modules

Modules allow users to control which libraries and binaries are made available within a terminal session. Note that module commands only affect the terminal they are run in; they are not global. This is why we routinely ask users to operate in a single terminal while troubleshooting issues.

Users who have installed one of our moose-environment packages will have access to modules. Please familiarize yourself with some commonly used module commands:

Command   Command Arg   Usage
module    list          List currently loaded modules
module    avail         List available modules
module    load          Load a space-separated list of modules
module    purge         Remove all loaded modules

To begin working with modules manually, it is best to start clean (especially for the duration of this FAQ). You can do so by purging any currently loaded modules:


module purge

Load the two default modules pertaining to your operating system:

  • Linux:

    • module load moose-dev-gcc moose-tools

  • Macintosh:

    • module load moose-dev-clang moose-tools

Loading these two modules will in turn load other necessary modules. The loaded modules should approximately resemble the following list:

  • Linux:

    module list
    
    Currently Loaded Modulefiles:
    1) moose/.gcc-7.3.1                             5) moose/.cppunit-1.12.1_gcc-7.3.1
    2) moose/.mpich-3.2_gcc-7.3.1                   6) moose-dev-gcc
    3) moose/.petsc-__PETSC_DEFAULT___mpich-3.2_gcc-7.3.1-opt   7) miniconda
    4) moose/.tbb-2018_U3                           8) moose-tools
    

  • Macintosh:

    module list
    
    Currently Loaded Modulefiles:
    1) moose/.gcc-7.3.1                               6) moose/.cppunit-1.12.1_clang-6.0.1
    2) moose/.clang-6.0.1                             7) moose-dev-clang
    3) moose/.mpich-3.2_clang-6.0.1                   8) miniconda
    4) moose/.petsc-__PETSC_DEFAULT___mpich-3.2_clang-6.0.1-opt   9) moose-tools
    5) moose/.tbb-2018_U3
    

If your terminal mirrors the above (version numbers may vary slightly), then you have a proper environment. Please return from whence you came, and continue troubleshooting.
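
As a quick cross-check of the lists above, here is a minimal sketch of a shell helper that reports which expected modules are absent from a captured module listing. The function name missing_modules is hypothetical and is not part of the moose-environment package. It does a plain substring match, so treat it as a rough check rather than a definitive one.

```shell
# Hypothetical helper (not part of moose-environment).
# $1: the text printed by `module list`
# remaining arguments: module names expected to appear in that text
# Prints one line per expected module that was NOT found.
missing_modules() {
  local listing="$1"
  shift
  local name
  for name in "$@"; do
    # -q: quiet, -F: match the name as a fixed string, not a regex
    if ! printf '%s\n' "$listing" | grep -qF -- "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

On Linux, one might call it as missing_modules "$(module list 2>&1)" moose-dev-gcc moose-tools miniconda. The 2>&1 assumes that module list writes its listing to standard error, which is common for Environment Modules but worth confirming on your machine. No output means every named module was found.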

note: Ughh! None of this is working!

If you find yourself looping through our troubleshooting guide, unable to solve your issue, there is still another attempt you can perform. Start over. But this time, perform the following before starting over:

env -i bash
export PATH=/usr/bin:/usr/sbin:/sbin
source /opt/moose/Modules/3.2.10/init/bash

These three commands will start a new command interpreter without any of your default environment. This is important because most of the errors we end up solving were caused by something in the user's environment.

Do note, if this ends up solving your issue, then there is something in one of possibly many bash profiles getting in the way. At this point, you will want to reach out to our mailing list and ask for help tracking this down. Keep in mind that, depending on the situation, you may be asked to contact the administrators of the machine you are working on (HPC clusters, for example, are beyond our control).
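
To illustrate what env -i changes, the following small sketch compares what a normal child shell and a scrubbed child shell inherit from the current terminal. DEMO_VAR is a made-up variable used only for this illustration.

```shell
# DEMO_VAR is a hypothetical variable used only for this demonstration.
export DEMO_VAR="from-parent"

# A normal child shell inherits exported variables from its parent.
inherited="$(/bin/sh -c 'printf "%s" "${DEMO_VAR:-}"')"

# env -i starts the child with an empty environment, so nothing is inherited.
scrubbed="$(env -i /bin/sh -c 'printf "%s" "${DEMO_VAR:-}"')"

echo "normal child shell sees: '${inherited}'"
echo "env -i child shell sees: '${scrubbed}'"
```

The first line should report from-parent and the second an empty value. Anything set by your bash profiles is dropped in the same way, which is why this step helps isolate environment problems.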

note

The modules contained in the moose-environment package are built in a hierarchical directory structure (some modules may not be visible until other modules are loaded).

Compiling libMesh

Compiling libMesh requires a proper environment. Let's verify a few things before attempting to build it (or possibly re-build it in your case):

  • Verify you have a proper compiler present:

    • Linux:

      which $CC
      /opt/moose/mpich-3.2/gcc-7.3.1/bin/mpicc
      
      mpicc -show
      gcc -I/opt/moose/mpich-3.2/gcc-7.3.1/include -L/opt/moose/mpich-3.2/gcc-7.3.1/lib -Wl,-rpath -Wl,/opt/moose/mpich-3.2/gcc-7.3.1/lib -Wl,--enable-new-dtags -lmpi
      
      which gcc
      /opt/moose/gcc-7.3.1/bin/gcc
      

    • Macintosh:

      which $CC
      /opt/moose/mpich-3.2/clang-6.0.1/bin/mpicc
      
      mpicc -show
      clang -Wl,-commons,use_dylibs -I/opt/moose/mpich-3.2/clang-6.0.1/include -L/opt/moose/mpich-3.2/clang-6.0.1/lib -lmpi -lpmpi
      
      which clang
      /opt/moose/llvm-6.0.1/bin/clang
      

    What you are looking for is that which and mpicc -show are returning proper paths. If these paths are not set, or which is not returning anything, see Modules for help on setting up a proper environment. Once set up, return here and verify the above commands return the proper messages.

  • Check that PETSC_DIR is set and does exist:

    • Linux:

      echo $PETSC_DIR
      /opt/moose/petsc-__PETSC_DEFAULT__/mpich-3.2_gcc-7.3.1-opt
      
      file $PETSC_DIR
      /opt/moose/petsc-__PETSC_DEFAULT__/mpich-3.2_gcc-7.3.1-opt: directory
      

    • Macintosh:

      echo $PETSC_DIR
      /opt/moose/petsc-__PETSC_DEFAULT__/mpich-3.2_clang-6.0.1-opt
      
      file $PETSC_DIR
      /opt/moose/petsc-__PETSC_DEFAULT__/mpich-3.2_clang-6.0.1-opt: directory
      

    • If echo $PETSC_DIR returns nothing, this would indicate your environment is not complete. See Modules for help on setting up a proper environment. Once set up, return here and verify the above commands return the proper messages.

    • If file $PETSC_DIR returns an error (possible if you are performing a Manual Install), it would appear you have not yet run configure. Configure builds this directory.

  • With the above all taken care of, try to build libMesh:

    
    cd moose/scripts
    ./update_and_rebuild_libmesh.sh
    

    If you encounter errors during this step, we would like to hear from you! Please seek help on our mailing list. Provide the diagnostic and libmesh configure logs. Those two files can be found in the following locations:

    • moose/libmesh/build/config.log

    • moose/scripts/libmesh_diagnostic.log

  • If libMesh built successfully, return to the beginning of the step that led you here, and try that step again.
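
The compiler and PETSC_DIR checks above can be sketched as two small shell helpers. This is a rough pre-build check under stated assumptions, not an official MOOSE tool: the function names compiler_status and petsc_dir_status are hypothetical, and they only test that a command or directory exists, not that it is the correct one.

```shell
# Hypothetical helpers (not part of MOOSE or moose-environment).

# Prints "unset" if no compiler name is given, "found" if the command
# can be located on PATH (or is a valid path), "not-found" otherwise.
compiler_status() {
  local cc="${1:-}"
  if [ -z "$cc" ]; then
    echo "unset"
  elif command -v "$cc" >/dev/null 2>&1; then
    echo "found"
  else
    echo "not-found"
  fi
}

# Prints "unset" if no path is given, "missing" if the path is not an
# existing directory, "ok" otherwise.
petsc_dir_status() {
  local dir="${1:-}"
  if [ -z "$dir" ]; then
    echo "unset"
  elif [ ! -d "$dir" ]; then
    echo "missing"
  else
    echo "ok"
  fi
}
```

A typical call would be compiler_status "${CC:-}" followed by petsc_dir_status "${PETSC_DIR:-}". A result of unset points back to the Modules section, while missing for PETSC_DIR matches the case where configure has not yet been run.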

Build Issues

Build issues are normally caused by an invalid environment, or by an update to your repository that left a mismatch between MOOSE and your application, or between either of those and the libMesh submodule.

  • Verify you have a functional Modules environment.

  • Verify the MOOSE repository is up to date, with the correct vetted version of libMesh:

    warning

    Before performing the following commands, be sure you have committed your work. Because... we are about to delete stuff!

    
    cd moose
    git checkout master
    git clean -xfd
    
    <output snipped>
    
    git fetch upstream
    git pull
    git submodule update --init
    

  • Verify you either have no moose directory set, or it is set correctly.

    
    [~] > echo $MOOSE_DIR
    
    [~] >
    

    The above should return nothing, or, it should point to the correct moose repository.

    note

    Most users do not use or set MOOSE_DIR. If the above command returns something, and you are not sure why, just unset it:

    
    unset MOOSE_DIR
    

  • Try building a simple hello world example (copy and paste the entire box):

    
    cd /tmp
    cat << EOF > hello.C
    #include <mpi.h>
    #include <stdio.h>
    
    int main(int argc, char** argv) {
      // Initialize the MPI environment
      MPI_Init(NULL, NULL);
    
      // Get the number of processes
      int world_size;
      MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    
      // Get the rank of the process
      int world_rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    
      // Get the name of the processor
      char processor_name[MPI_MAX_PROCESSOR_NAME];
      int name_len;
      MPI_Get_processor_name(processor_name, &name_len);
    
      // Print off a hello world message
      printf("Hello world from processor %s, rank %d out of %d processors\n",
             processor_name, world_rank, world_size);
    
      // Finalize the MPI environment.
      MPI_Finalize();
    }
    EOF
    
    mpicxx -fopenmp hello.C
    

    If the build failed, and you have the correct Modules environment loaded, then you should attempt to perform the 'Ughh! None of this is working' step in the Modules section. It would seem there is something else in your environment that is inhibiting your ability to compile simple programs.

    If the build was successful, attempt to execute the hello world example:

    
    mpiexec -n 4 /tmp/a.out
    

    You should receive a response similar to the following:

    
    Hello world from processor my_hostname, rank 0 out of 4 processors
    Hello world from processor my_hostname, rank 1 out of 4 processors
    Hello world from processor my_hostname, rank 3 out of 4 processors
    Hello world from processor my_hostname, rank 2 out of 4 processors
    

  • If all of the above has succeeded, you should attempt to rebuild libMesh again.

Failing Tests

If many or all tests are failing, there is a good chance the fix is simple. Work through these steps to narrow down the possible cause.

First, run a test that should always pass:


cd moose/test
make -j 8
./run_tests -i always_ok -p 2

note: Did make -j 8 fail?

If make -j 8 fails, please proceed to Build Issues above. This may also be the reason why all your tests are failing.

If this test passes, it proves that the TestHarness is available, that libMesh is built, and that the TestHarness has a working MOOSE framework available to it. This means your failing test may be beyond the scope of this troubleshooting guide. However, do continue to read through the situations listed below. If the error is not listed, please submit your failed test results to our mailing list for help.

If the test did fail, chances are your test and our test are failing for the same reason:

  • Environment variables are somehow instructing the TestHarness to use improper paths. Try each of the following and re-run your test. You may find you receive a different error each time. Simply continue troubleshooting using that new error, and work your way down. If the error is not listed here, then it is time to ask the mailing list for help:

    • Check whether echo $METHOD returns anything. If it does, try unsetting it with unset METHOD.

      • If this was set to anything other than opt, it will be necessary to rebuild moose/test again:

        
        cd moose/test
        make -j 8
        

    • Check whether echo $MOOSE_DIR returns anything. If it does, try unsetting it with unset MOOSE_DIR.

      • If this was set to anything, you must rebuild libMesh.

    • Check whether echo $PYTHONPATH returns anything. If it does, try unsetting it with unset PYTHONPATH.

  • Failed to import hit:

    • Verify you have the miniconda package loaded. See Modules

      • If it was not loaded, and now it is, it may be necessary to re-build moose:

        
        cd moose/test
        make -j 8
        

  • No Modulefiles Currently Loaded

    • Verify you have modules loaded. See Modules

  • Application not found

    • Your Application has not yet been built. You need to successfully perform a make. If make is failing, please see Build Issues above.

    • Perhaps you have specified invalid arguments to run_tests? See TestHarness More Options. Specifically for help with:

      • --opt

      • --dbg

      • --oprof

  • gethostbyname failed, localhost (errno 3)

    • This is a fairly common occurrence, which happens when your internal network stack / route is not correctly configured for the local loopback device. Thankfully, there is an easy fix:

      • Obtain your hostname:

        
        hostname
        
        mycoolname
        

      • Linux & Macintosh: Add the result of hostname to your /etc/hosts file, like so:

        
        sudo vi /etc/hosts
        
        127.0.0.1  localhost
        
        # The following lines are desirable for IPv6 capable hosts
        ::1        localhost ip6-localhost ip6-loopback
        ff02::1    ip6-allnodes
        ff02::2    ip6-allrouters
        
        127.0.0.1  mycoolname  # <--- add this line to the end of your hosts file
        

        Everyone's hosts file is different, but the result of adding the line described above will be the same.

      • Macintosh only, 2nd method:

        
        sudo scutil --set HostName mycoolname
        

        We have received reports where this method sometimes does not work.

  • TIMEOUT

    • If your tests fail due to timeout errors, it is most likely that you have a good installation but a slow machine (or slow filesystem). You can adjust the amount of time that the TestHarness allows a test to run to completion by adding a parameter to your test file:

      
      [Tests]
        [./timeout]
          type = RunApp
          input = my_input_file.i
          max_time = 300   # time in seconds before a timeout occurs; 300 is the default for all tests
        [../]
      []
      

  • CRASH

    • A crash indicates the TestHarness executed your application (correctly), but then your application exited with a non-zero return code. See Build Issues above for a possible solution.

  • EXODIFF

    • An exodiff indicates the TestHarness executed your application, and your application exited correctly. However, the generated results differ from the supplied gold file. If this test passes on some machines and fails on others, you may have applied too tight a tolerance to the acceptable error values for that specific machine. We call this phenomenon machine noise.

  • CSVDIFF

    • A different file format following the same error checking paradigm as an exodiff test.
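
For the gethostbyname failure described above, the following is a minimal sketch of a check that tells you whether a host name already appears in a hosts file before you edit it. The function name host_in_hosts_file is hypothetical, and the sketch assumes the usual hosts layout of an address followed by one or more names, with # starting a comment.

```shell
# Hypothetical helper (not part of MOOSE).
# $1: host name to look for
# $2: hosts file to search (defaults to /etc/hosts)
# Exit status 0 if the name appears as a whole field after the address
# on a non-comment line, 1 otherwise.
host_in_hosts_file() {
  local name="$1"
  local file="${2:-/etc/hosts}"
  awk -v name="$name" '
    { sub(/#.*/, "") }    # strip comments so commented-out entries are ignored
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$file"
}
```

For example, host_in_hosts_file "$(hostname)" && echo "already listed" reports whether the line still needs to be added. This only confirms the entry exists; it does not verify that the address on that line is the loopback address.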