2.2.1. ExaWorks SDK Container Image

The ExaWorks SDK is available as a container image on Docker Hub. This Docker image is a great place to start if you want to get familiar with the workflow tools that comprise the SDK without the overhead of a full installation.

2.2.1.1. Preparing the Container Environment

As we will be executing this tutorial within a container, we must first ensure that the Docker daemon is running. This is system dependent, so see the documentation for your specific system. If you wish to run this notebook directly, note that it requires a bash kernel for Jupyter. You can install a bash kernel into your Python virtual environment by running:

pip install bash_kernel
python -m bash_kernel.install

2.2.1.2. Running the ExaWorks Container

After preparing your environment, you can pull the SDK image.

[19]:
docker pull exaworks/sdk
Using default tag: latest
latest: Pulling from exaworks/sdk
Digest: sha256:f278e43866f4e1a1da9b7d0d98f433ca88e0a598c504c2f7d3831690195d64a4
Status: Image is up to date for exaworks/sdk:latest
docker.io/exaworks/sdk:latest

After pulling the image, you can run arbitrary commands within the container.

Note that in this tutorial, we often run each command as the argument to docker run. This means that no progress or state is saved between commands, because each command runs in a new container based on the SDK image every time. We use the --login flag because much of the environment is initialized through the .bashrc; without that flag many of the packages would not work. This tutorial uses docker run in this way because of the nature of the Jupyter notebook running it, and in some instances code snippets are placed in the Markdown sections to show more complicated actions.

Instead of using docker run for every command, it is recommended that you start an interactive session using:

docker run -it exaworks/sdk bash

This will give you a shell within the container in which you can execute all of the commands that otherwise fall under the -c <command> flag.
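For example, you could run the version checks from the next cell inside a single interactive session. This is only a sketch; the individual commands mirror the per-command invocations below.

docker run -it exaworks/sdk bash
# inside the container:
flux -V
python -c 'import parsl; print(parsl.__version__)'
radical-pilot-version
swift-t -v
exit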

[17]:
echo Flux Version:
docker run -t exaworks/sdk bash --login -c "flux -V"
echo

echo Parsl Version:
docker run -t exaworks/sdk bash --login -c "python -c 'import parsl; print(parsl.__version__)'"
echo

echo Radical Pilot Version:
docker run -t exaworks/sdk bash --login -c "radical-pilot-version"
echo

echo Swift-t Version:
docker run -t exaworks/sdk bash --login -c "swift-t -v"
Flux Version:
commands:               0.28.0
libflux-core:           0.28.0
build-options:          +hwloc==1.11.0

Parsl Version:
1.3.0-dev

Radical Pilot Version:
1.11.2

Swift-t Version:
STC: Swift-Turbine Compiler 0.9.0
         for Turbine: 1.3.0
Using Java VM:    /usr/bin/java
Using Turbine in: /opt/swift-t/turbine

Turbine 1.3.0
 installed:    /opt/swift-t/turbine
 source:       /tmp/build-swift-t/swift-t/turbine/code
 using CC:     /usr/local/bin/mpicc
 using MPI:    /usr/local/lib mpi "OpenMPI"
 using Tcl:    /opt/tcl-8.6.11/bin/tclsh8.6

2.2.1.3. Running the Tests

Each workflow tool has a set of tests located at /tests/<packagename>/test.sh.
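For instance, inside an interactive container you could run all of the bundled suites in one loop. This is a sketch; the package directory names match the test cells below.

for package in flux parsl rp swift
do
  bash /tests/${package}/test.sh
done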

[1]:
echo Flux Tests:
docker run -t exaworks/sdk bash --login -c "bash /tests/flux/test.sh" | head -n 7
echo "..."
Flux Tests:
Cloning into 'flux-core'...
remote: Enumerating objects: 90454, done.
remote: Counting objects: 100% (7455/7455), done.
remote: Compressing objects: 100% (2720/2720), done.
remote: Total 90454 (delta 5113), reused 6693 (delta 4716), pack-reused 82999
Receiving objects: 100% (90454/90454), 40.30 MiB | 13.47 MiB/s, done.
Resolving deltas: 100% (67270/67270), done.
write /dev/stdout: broken pipe
...
[21]:
echo Parsl Tests:
docker run -t exaworks/sdk bash --login -c "bash /tests/parsl/test.sh"
Parsl Tests:
Hello World from Python!
Hello World!

Output matches
[1]:
echo Radical Pilot Tests:
docker run -t exaworks/sdk bash --login -c "bash /tests/rp/test.sh" | head -n 7
echo "..."
Radical Pilot Tests:
--- start MongoDB
about to fork child process, waiting until server is ready for connections.
forked process: 26
child process started successfully, parent exiting
--- smoke test

================================================================================
write /dev/stdout: broken pipe
...
[2]:
echo Swift-t Tests:
docker run -t exaworks/sdk bash --login -c "bash /tests/swift/test.sh" | head -n 7
echo "..."
Swift-t Tests:
+ [[ openmpi == \o\p\e\n\m\p\i ]]
+ export TURBINE_LAUNCH_OPTIONS=--allow-run-as-root
+ TURBINE_LAUNCH_OPTIONS=--allow-run-as-root
+ swift-t -v
STC: Swift-Turbine Compiler 0.9.0
         for Turbine: 1.3.0
Using Java VM:    /usr/bin/java
write /dev/stdout: broken pipe
...

2.2.1.4. Running the Tutorial Notebooks

As of now, Jupyter is not automatically included in the SDK container image, but we can easily install it! First, we have to run our container while exposing a port and mounting the directory that contains the Jupyter notebooks. The notebooks are not currently part of the container image, so we need to make them accessible from within the container using the -v flag. We also need to specify that we want the Jupyter server to resolve on the localhost at the default Jupyter port. We do this by mapping the port from the host machine to the container with -p 8888:8888 and specifying the localhost IP when starting the Jupyter server.

$ docker run -p 8888:8888 -v /path/to/notebooks:/notebooks -it exaworks/sdk bash

You can then install and run Jupyter.

# pip install jupyter
# cd /notebooks
# jupyter notebook --allow-root --ip 0.0.0.0 --no-browser

Then just copy the URL into your browser to view and run the notebooks. The other notebooks may require some additional prerequisites and configuration before they can be run.
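If you prefer not to keep an interactive shell open, the same steps can be chained into a single docker run. This is just a sketch that combines the commands above; the notebook path is a placeholder.

docker run -p 8888:8888 -v /path/to/notebooks:/notebooks -t exaworks/sdk bash --login -c \
  "pip install jupyter && cd /notebooks && jupyter notebook --allow-root --ip 0.0.0.0 --no-browser"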

2.2.1.5. SDK Image Tags

As part of our CI/CD pipeline, we build the SDK with multiple build parameters, including different base operating systems, Python versions, MPI flavors, and package managers. To organize these different builds, we use tags to distinguish them. When selecting an image, you can choose a specific tag for the build spec that you want to test. The tag is structured as follows: <os>_<package_manager>_<mpi_flavor>_<python_version>. The available tags can be seen here.

[5]:
docker pull exaworks/sdk:ubuntu2004_pip_openmpi_3.8
docker run -t exaworks/sdk:ubuntu2004_pip_openmpi_3.8 bash --login -c "python -V"
ubuntu2004_pip_openmpi_3.8: Pulling from exaworks/sdk
Digest: sha256:86dee9aaa13aa21715b2035945307220e560fc0141d7a08166f7bbcc4257fbed
Status: Image is up to date for exaworks/sdk:ubuntu2004_pip_openmpi_3.8
docker.io/exaworks/sdk:ubuntu2004_pip_openmpi_3.8
Python 3.8.10

2.2.1.6. SDK Base Image

When building the SDK container image, we first create a minimal build base image that contains all of the dependencies for the SDK. This base image can be a great starting point if you want to work through building the rest of the SDK manually, or just a subset of the packages. The base image can be found here. The base image also follows the same tagging conventions as the full SDK image.

[7]:
docker pull exaworks/sdk-base
Using default tag: latest
latest: Pulling from exaworks/sdk-base
Digest: sha256:a40f6220a540b9e1e80250b0cdcc88503a9324d86f5db64102f5bb1dd2e9de9b
Status: Image is up to date for exaworks/sdk-base:latest
docker.io/exaworks/sdk-base:latest
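Because the base image follows the same tagging conventions, you can also pull a specific build spec. This is a sketch; it assumes the matching tag has been published for sdk-base.

docker pull exaworks/sdk-base:ubuntu2004_pip_openmpi_3.8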

2.2.1.7. Development on the SDK Container Image

The ExaWorks SDK is an open-source project, and we encourage community engagement and development on it. This includes development on the SDK container image. Be sure to check out our contribution guidelines and best practices before making changes!

2.2.1.8. An Overview of the Build Process

2.2.1.8.1. Base Image

As mentioned above, the first step in the build process is to create a minimal build base image with all of the dependencies for the SDK. This is currently split into three possible base Dockerfiles: one for rockylinux8, one for centos7, and one for ubuntu20.04. Each of these Dockerfiles uses a combination of the OS-specific package manager and a set of shared build scripts to install the dependencies.

The base image is where all of the different build parameters are specified. While the OS determines which Dockerfile the image is built from, the other build parameters are passed in during the build process (see the sketch after the parameter list below). While the goal of the build parameters is to create a large build matrix where we can test all combinations of environments in our CI pipeline, several of the combinations still fail to build. Development in this area could go toward fixing the build for some combinations of build parameters or toward adding new ones.

2.2.1.8.1.1. Build Parameters

  1. Operating System: centos7, rockylinux8, ubuntu20.04

    See the SDK repo under docker/base/<os>

  2. Package Manager: pip, conda

    See install-python-env.sh

  3. MPI Flavor: openmpi, mpich

    See install-mpi.sh

  4. Python Version: 3.7, 3.8, 3.9

    See install-python.sh or if conda, see install-python-env.sh
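As a rough illustration of how these parameters come together, a base-image build might look like the following sketch. The build-argument names and directory layout here are assumptions for illustration and may differ from those actually used in the SDK repo.

docker build \
  --build-arg PACKAGE_MANAGER=pip \
  --build-arg MPI=openmpi \
  --build-arg PYTHON_VERSION=3.8 \
  -t exaworks/sdk-base:ubuntu2004_pip_openmpi_3.8 \
  docker/base/ubuntu20.04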

2.2.1.8.2. Workflow Tool Images

Each workflow tool is installed using its own Dockerfile and any additional build scripts. Each one has a build argument for the base image, which sets the FROM line in the Dockerfile. Development in this area could expand the tests for a specific workflow tool or add a new tool to the SDK image.
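For example, a single tool image could be built on top of the base image roughly like this. The image name and tag are illustrative; the BASE_IMAGE argument and the docker/flux directory are taken from the snippets later in this section.

docker build \
  --build-arg BASE_IMAGE=exaworks/sdk-base:ubuntu2004_pip_openmpi_3.8 \
  -t sdk_flux:ubuntu2004_pip_openmpi_3.8 \
  docker/flux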

2.2.1.8.2.1. Testing

Each workflow tool has its own set of tests, which are added to the SDK image under /tests/<package>/ and are initiated by a test.sh in that directory. These tests give the code teams key insights into where bugs or failures might exist in their codebases and how to fix them. Our CI pipeline runs these tests and then exports the data to our testing dashboard. These tests range from full unit and integration tests to simple sanity checks, and more additions or use cases are always welcome.

2.2.1.8.2.2. Adding a New Workflow Tool

We encourage community engagement and wish to expand the ExaWorks SDK with new workflow tools. To do so, we also need to expand the SDK image. We do this by adding a new directory under docker in the SDK repo for the Dockerfile and any related build scripts. All of the tool-specific images should be able to be built directly from the SDK base image or from any other SDK image. We use the BASE_IMAGE build argument to set which SDK image we are building from.

ARG BASE_IMAGE=exaworks/sdk-base
FROM ${BASE_IMAGE}

Aside from adding the build files for the new tool, be sure to add tests as well!
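Putting these pieces together, building and smoke-testing a new tool image might look like the following sketch, where <new-tool> is a placeholder for your tool's directory and test path.

docker build \
  --build-arg BASE_IMAGE=exaworks/sdk-base \
  -t sdk_<new-tool> \
  docker/<new-tool>
docker run -t sdk_<new-tool> bash --login -c "bash /tests/<new-tool>/test.sh"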

2.2.1.8.2.3. Updating the CI Pipeline

After adding a new tool to the SDK, also be sure to update the CI pipeline to include builds for that new workflow tool. This can be done by editing ci.yml under the build and test stages. During the build stage, we add new workflow tools one at a time and update the tag with the new tool being added. For example:

docker build \
          -t rp_parsl_swift_flux:${{ env.DOCKER_TAG }} \
          --build-arg BASE_IMAGE=rp_parsl_swift:${{ env.DOCKER_TAG }} \
          docker/flux

You can see that in this part of the build process, we have already added Radical Pilot, Parsl, and Swift-t to the SDK image, and we are currently adding Flux. The ${{ env.DOCKER_TAG }} represents the combination of build arguments from the base image. Be sure to add any new image builds before the last one containing the integration, and to update the base image for the integration build. To update the tests, simply add the new tool to the for loop.

for core in flux parsl rp swift flux-parsl <new-tool>
        do
          ...
        done

When all of the changes pass in ci.yml, apply those same changes to the build process in deploy.yml.
