Software package management: modules

On most HPC clusters, software packages are not directly accessible. You need to load a module in order to gain access to a given piece of software.

Motivation for modules

To explain why modules are needed on an HPC cluster, we first need to explain how the shell finds the executables for the commands we type in the terminal. The shell cannot search the entire filesystem for an executable whose name matches the command: it would take too much time. Instead, only a handful of directories are searched. These search paths are stored in the PATH environment variable. By default, this variable only contains the directories where the executables of most basic commands are located. We can see the content of this variable with the command:

echo $PATH
/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin

The search paths form a list of directories separated by colons (:). Adding a directory at the front or at the end of this list makes the shell search it for executables. For example, to add a directory at the front:

export PATH=/my/new/directory:$PATH

The directories are searched in the order they appear in the list, and the shell stops the search as soon as a matching executable is found.
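
The search order can be observed with a small experiment. The sketch below creates two throwaway directories (the names dir_a and dir_b and the command name hello are made up for this illustration), puts an executable with the same name in each, and runs it with the two directories in both orders:

```shell
# Create two temporary directories, each containing an executable named "hello".
# dir_a, dir_b and hello are hypothetical names used only for this demo.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/dir_a" "$demo_dir/dir_b"
printf '#!/bin/sh\necho "from dir_a"\n' > "$demo_dir/dir_a/hello"
printf '#!/bin/sh\necho "from dir_b"\n' > "$demo_dir/dir_b/hello"
chmod +x "$demo_dir/dir_a/hello" "$demo_dir/dir_b/hello"

# Each $(...) runs in a subshell, so the PATH of the current shell is untouched.
a_first=$(PATH="$demo_dir/dir_a:$demo_dir/dir_b:$PATH"; hello)
b_first=$(PATH="$demo_dir/dir_b:$demo_dir/dir_a:$PATH"; hello)

echo "dir_a first: $a_first"
echo "dir_b first: $b_first"

# Clean up the temporary files
rm -rf "$demo_dir"
```

In both cases the executable found in the directory listed first is the one that runs, and the other is never reached.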

Now, consider an HPC cluster shared by a large number of users, each with their specific needs. One user may use an old version of a software package while another needs a newer version. If the PATH variable contains

/soft/package_name/version_new/bin:/soft/package_name/version_old/bin

then, as the executable names are the same for these two versions, the user who needs the old version will end up using the new one. This might not be what this user wants, as they might rely on a particular feature that was removed from the newer version. Also, for reproducibility reasons, researchers tend to stick to one particular version for an entire research project.

A solution for the user who needs the old version might be to invoke the executable using the full path:

/soft/package_name/version_old/bin/exec

but this solution is not really user-friendly, as it requires typing quite a long command and remembering where the executable is located.

Another problem is that, nowadays, most executable binaries are not statically linked, i.e., they load libraries at runtime. This means we have to make sure that the executable we want to use has access to the libraries on which it depends. In the same way as for executables, the search paths for libraries are defined with an environment variable: LD_LIBRARY_PATH. The consequence is that if we want to use a particular version of a software package, we have to set

  • in PATH, the path to the directory where our executable is located.
  • in LD_LIBRARY_PATH, all the paths to the directories where the libraries that our executable depends on are located. In addition, we need to make sure that the dependencies of the dependencies are also available. For an executable with a complex dependency structure, this can quickly become unmanageable.
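
As a rough sketch of what this looks like when done by hand, the commands below set both variables for one package and two of its dependencies. All the paths are hypothetical: they follow the /soft/... layout used above and assume that each installation keeps its libraries in a lib subdirectory.

```shell
# Hypothetical installation prefixes (assumptions made for this illustration)
package=/soft/package_name/version_old
dependency_a=/soft/dependency_a/version_x
dependency_b=/soft/dependency_b/version_y

# Make the executable of the package visible to the shell
export PATH="$package/bin:$PATH"

# Make the libraries of the package and of each dependency visible at runtime.
# ${VAR:+:$VAR} appends the previous value only if it was non-empty,
# which avoids leaving a stray ":" (an empty entry) at the end of the list.
export LD_LIBRARY_PATH="$package/lib:$dependency_a/lib:$dependency_b/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "$PATH"
echo "$LD_LIBRARY_PATH"
```

Every additional dependency needs its own entry, and the whole list has to be revised each time we switch to another version of the package.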

The problems presented above highlight the reason why environment modules were created. Modules allow for the dynamic modification of a user environment. With modules, a user can get access to software and switch between versions with ease, letting the module system take care of the search paths and of the environment in general.
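
To give an intuition of what "dynamic modification" means, here is a toy sketch of a reversible change to PATH. The functions toy_load and toy_unload are hypothetical helpers written for this illustration. They are not how the module system is implemented, and a real module typically changes more than PATH.

```shell
# Toy stand-ins for loading and unloading a package (hypothetical helpers).
# $1 is the installation prefix of the package.
toy_load() {
    PATH="$1/bin:$PATH"
    export PATH
}

toy_unload() {
    # Rebuild PATH in a subshell, skipping the entry added by toy_load
    PATH=$(
        IFS=:
        set -f      # no filename expansion while splitting PATH
        kept=""
        for dir in $PATH; do
            [ "$dir" = "$1/bin" ] || kept="${kept:+$kept:}$dir"
        done
        printf '%s' "$kept"
    )
    export PATH
}

toy_load /soft/package_name/version_old
echo "after load:   $PATH"
toy_unload /soft/package_name/version_old
echo "after unload: $PATH"
```

The point of the sketch is that the change is undone by removing exactly what was added, without having to remember the previous value of the variable.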

Finding modules

All available modules can be listed using the module av command, which will produce quite a long list on most HPC clusters. You can navigate this list using the Up and Down keys and exit the pager by pressing Q.

You can narrow the search for available modules by adding a keyword to the command. For example, to find modules related to Python:

  $ module av Python

----------------------------- Releases (2021b) ------------------------------
   IPython/7.26.0-GCCcore-11.2.0
   KAT/2.4.2-foss-2021b-Python-3.9.6
   Meep/1.22.0-foss-2021b-Python-3.9.6
   Python/3.9.6-GCCcore-11.2.0-bare
   Python/3.9.6-GCCcore-11.2.0           (D)
   flatbuffers-python/2.0-GCCcore-11.2.0
   pkgconfig/1.5.5-GCCcore-11.2.0-python
   protobuf-python/3.17.3-GCCcore-11.2.0

The result is a list of all modules that have Python in their name, including two versions of Python itself: 3.9.6-GCCcore-11.2.0-bare and 3.9.6-GCCcore-11.2.0. The first is a minimal installation with no extra packages, while the second is a Python with quite a few packages installed. When multiple modules have the same name, the default module is marked with a (D). In the case of the Python module, 3.9.6-GCCcore-11.2.0 is the default.

Module naming scheme

Modules on NIC5 use the following naming scheme:

PACKAGE_NAME/PACKAGE_VERSION-TOOLCHAIN_NAME-TOOLCHAIN_VERSION

where

  • PACKAGE_NAME: the name of the software package
  • PACKAGE_VERSION: the version of the software package
  • TOOLCHAIN_NAME: the name of the toolchain (compiler) used to compile the package
  • TOOLCHAIN_VERSION: the version of the toolchain used to compile the package

Sometimes, as in the Python example, variants of the same package are installed and a -SUFFIX is added at the end of the name (-bare in the Python example).
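
The naming scheme can be made concrete by splitting a module name into its parts with shell parameter expansion. The function split_module_name below is a hypothetical helper written for this page. It assumes that none of the fields after the / contains a hyphen, so it is only an illustration and will give wrong results for names that do not follow the scheme (for example OpenSSL/1.1, which has no toolchain part).

```shell
# Split PACKAGE_NAME/PACKAGE_VERSION-TOOLCHAIN_NAME-TOOLCHAIN_VERSION[-SUFFIX]
# into its components (hypothetical helper, results stored in shell variables).
split_module_name() {
    full=$1
    package_name=${full%%/*}        # everything before the first "/"
    rest=${full#*/}                 # everything after the first "/"
    package_version=${rest%%-*}     # up to the first "-"
    rest=${rest#*-}
    toolchain_name=${rest%%-*}
    rest=${rest#*-}
    toolchain_version=${rest%%-*}
    case $rest in
        *-*) suffix=${rest#*-} ;;   # whatever remains is the suffix
        *)   suffix="" ;;           # no suffix present
    esac
}

split_module_name "Python/3.9.6-GCCcore-11.2.0-bare"
echo "package:   $package_name $package_version"
echo "toolchain: $toolchain_name $toolchain_version"
echo "suffix:    ${suffix:-none}"
```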

This naming scheme originates from the tool we use to install most of the software on NIC5 (EasyBuild), but it also highlights a very important fact: most software on an HPC system is installed from source. The main reason is that, in order to get maximum performance, the software needs to be compiled with optimizations specific to the CPUs of NIC5.

Loading modules

Continuing with our Python example, we will now discuss how to load a module. Right after we log in to NIC5, if we check the Python version, we can see that we have version 3.6.8 and that this Python is installed in /usr/bin.

 $ python --version
Python 3.6.8

 $ which python
/usr/bin/python

This version of Python is the one that comes with the operating system of NIC5 (Rocky Linux). We sometimes refer to this Python as the "System Python". Now, we will load a Python module to use another version of Python. This can be done with the module load command.

module load PACKAGE_NAME

where PACKAGE_NAME is the name of the software package we want to load. Knowing that, we can load the Python module with the following command

module load Python

Then, if we run the same commands as before to determine the Python version and where it is installed, we get

 $ python --version
Python 3.9.6

 $ which python
/opt/cecisw/arch/easybuild/2021b/software/Python/3.9.6-GCCcore-11.2.0/bin/python

We can see that we now have version 3.9.6 and that it is installed in a completely different location.

In our example, we did not specify the version of the module we wanted to load. As a result, the default module has been loaded (Python/3.9.6-GCCcore-11.2.0). If we want the "bare" variant of this module, which is not the default, we need to explicitly provide the version when loading the module.

module load Python/3.9.6-GCCcore-11.2.0-bare

Listing loaded modules

Listing loaded modules is done using the module list command. For example, continuing the example of the previous section, where we loaded the Python module:

 $ module list

Currently Loaded Modules:
  1) tis/2018.01                  (S)   9) libreadline/8.1-GCCcore-11.2.0
  2) releases/2021b               (S)  10) Tcl/8.6.11-GCCcore-11.2.0
  3) StdEnv                       (H)  11) SQLite/3.36-GCCcore-11.2.0
  4) GCCcore/11.2.0                    12) XZ/5.2.5-GCCcore-11.2.0
  5) zlib/1.2.11-GCCcore-11.2.0        13) GMP/6.2.1-GCCcore-11.2.0
  6) binutils/2.37-GCCcore-11.2.0      14) libffi/3.4.2-GCCcore-11.2.0
  7) bzip2/1.0.8-GCCcore-11.2.0        15) OpenSSL/1.1
  8) ncurses/6.2-GCCcore-11.2.0        16) Python/3.9.6-GCCcore-11.2.0

The first three modules in the list are modules that are loaded by default when you log in. All the other modules result from loading the Python module. As we can see, we did load more modules than just the Python module itself. These additional modules are dependencies, i.e., packages needed by Python at run time.

Unloading modules

To remove the Python module from our environment, we can use the module unload command

module unload Python

Then, if we check the effect of the command by listing the currently loaded modules

 $ module list

Currently Loaded Modules:
  1) tis/2018.01                  (S)   9) libreadline/8.1-GCCcore-11.2.0
  2) releases/2021b               (S)  10) Tcl/8.6.11-GCCcore-11.2.0
  3) StdEnv                       (H)  11) SQLite/3.36-GCCcore-11.2.0
  4) GCCcore/11.2.0                    12) XZ/5.2.5-GCCcore-11.2.0
  5) zlib/1.2.11-GCCcore-11.2.0        13) GMP/6.2.1-GCCcore-11.2.0
  6) binutils/2.37-GCCcore-11.2.0      14) libffi/3.4.2-GCCcore-11.2.0
  7) bzip2/1.0.8-GCCcore-11.2.0        15) OpenSSL/1.1
  8) ncurses/6.2-GCCcore-11.2.0

we see that, indeed, the Python module is no longer loaded in the environment, but its dependencies are still loaded. This is by design. While the tool we use to install software on NIC5 allows the generation of modules that unload their dependencies, doing so might have undesired side effects. If two modules share a dependency, unloading the first module would also unload the dependency module and possibly break the second module.

To remove all loaded modules, we can use the module purge command.

 $ module purge
The following modules were not unloaded:
  (Use "module --force purge" to unload all):

  1) releases/2021b   2) tis/2018.01

The module system informed us that two modules were not unloaded. These two modules are "sticky" modules, i.e., modules that should always be present in the environment. You can force the unloading of these modules using the module --force purge command but it's not recommended.
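
Putting the commands of this page together, a script can start from a clean environment and load a pinned version of a module, so that it does not depend on what was loaded before or on which version happens to be the default. The sketch below only uses the module commands shown above. The function name setup_environment is a hypothetical choice, and the guard around the call is there so that the script fails gracefully on a machine without the module command.

```shell
# Start from a clean environment, then load an explicit module version.
# setup_environment is a hypothetical helper name chosen for this sketch.
setup_environment() {
    module purge                                  # sticky modules stay loaded
    module load Python/3.9.6-GCCcore-11.2.0       # full name: version is pinned
    module list                                   # record what is loaded
}

# Only call it where the module command exists (e.g. on the cluster)
if command -v module >/dev/null 2>&1; then
    setup_environment
else
    echo "module command not found: run this on the cluster" >&2
fi
```

Spelling out the full module name follows the reproducibility argument made earlier: the script keeps loading the same version even if the default changes.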

Summary

Command                       Description
module av                     List all available modules
module av PACKAGE_NAME        List all modules with a name matching PACKAGE_NAME
module load PACKAGE_NAME      Load the module named PACKAGE_NAME
module unload PACKAGE_NAME    Unload the module named PACKAGE_NAME
module list                   List loaded modules
module purge                  Unload all loaded modules (except sticky modules)