Container support
Containers provide a convenient way to abstract the heterogeneity of execution resources and to provide a common sandbox for execution.
There are two models for executing an app in a container:
- Workers are launched inside containers; a single container can be re-used for several apps.
- Each app is launched inside a fresh container.
This document describes the first case. In this model, the apps are executed on a worker that is launched within a container. For simplicity we focus on Docker, although the same approach can be used with other supported container systems such as Singularity and Shifter.
Caution
This feature is available from Parsl v0.5.0 in an experimental state.
We request feedback and feature enhancement requests via GitHub.
Docker
The following section describes how to create a pool of Docker containers, each with a worker that executes specific apps.
Installing Docker
To install Docker, ensure you have sudo privileges and follow Docker’s official installation instructions.
Once installed, verify that Docker works:
# Get the Docker version
docker --version
# Get Docker info/stats
docker info
# Do a quick check with hello-world
docker run hello-world
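The same check can be scripted, for example at the start of a workflow. The following is a minimal sketch and not part of Parsl: the helper name `docker_version` is hypothetical, and it relies only on the `docker --version` command shown above.

```python
import shutil
import subprocess


def docker_version(executable="docker"):
    """Return the output of `<executable> --version`, or None if unavailable.

    Hypothetical helper for illustration; it is not part of Parsl or Docker.
    """
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(executable) is None:
        return None
    try:
        completed = subprocess.run(
            [executable, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # The binary exists but could not be run successfully.
        return None
    return completed.stdout.strip()


if __name__ == "__main__":
    version = docker_version()
    if version is None:
        print("Docker does not appear to be installed or runnable.")
    else:
        print(version)
```

A `None` result only means that the command could not be run; it does not distinguish a missing installation from a permissions problem.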
Creating an image
Please note that the following instructions are tested on Ubuntu 16.04. If you are on a different operating system, the following instructions might need to be tweaked for your specific system. Such cases will be noted explicitly.
Pull a Docker image with the latest Python.
# Get a basic Python image
docker pull python
Construct a new Python image by creating a file called Dockerfile with the following contents. Every command in the container definition is assumed to be running in Ubuntu.
# Use an official Python runtime as a parent image
FROM python:3.6
# Set the working directory to /home
WORKDIR /home
# Install Parsl
RUN pip3 install parsl
Once your updates are made, create a Docker image from the Dockerfile.
docker build -t parslbase_v0.1 .
Make sure your user has privileges to launch and manage Docker by adding yourself to the docker group. The following command assumes an Ubuntu machine.
sudo usermod -a -G docker $USER
Ensure that you are running Python 3.6.x. If you need another Python version, make sure that the container built in the previous steps matches the host machine’s environment.
# This command should return Python 3.6 or higher.
python3 -V
Set up Parsl apps. The following directories contain sample apps for this guide:
parsl/docker/app1
parsl/docker/app2
These container scripts are set up so that, when they are built, they copy the application Python code over to /home, which will be the cwd when app invocations are made. Each of these appN.py scripts contains the definition of a predict(List) function.
Build the test applications as Docker images. We assume you are in the top level of the Parsl repository.
# Docker build app1
cd docker/app1
docker build -t app1_v0.1 .
# Docker build the next app
cd ../app2
docker build -t app2_v0.1 .
# Check the new images:
docker images
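When there are several app images, the build steps above can be driven from a short script. This is a sketch under stated assumptions: the helpers `build_command` and `build_all` and the `run` flag are hypothetical, the directory layout and the `<app>_v0.1` tag format follow the example above, and the only Docker invocation used is the `docker build -t <tag> .` form already shown.

```python
import os
import subprocess


def build_command(app_name, version="v0.1"):
    """Return the argv list for building one app image (hypothetical helper)."""
    # Mirrors `docker build -t app1_v0.1 .` from the steps above.
    return ["docker", "build", "-t", "{}_{}".format(app_name, version), "."]


def build_all(app_names, docker_dir="docker", run=False):
    """Build each app image from <docker_dir>/<app_name>.

    With run=False (the default) nothing is executed; the planned
    (directory, command) pairs are only returned for inspection.
    """
    plan = []
    for name in app_names:
        workdir = os.path.join(docker_dir, name)
        command = build_command(name)
        plan.append((workdir, command))
        if run:
            # cwd plays the role of `cd docker/app1` in the shell version.
            subprocess.run(command, cwd=workdir, check=True)
    return plan


if __name__ == "__main__":
    for workdir, command in build_all(["app1", "app2"]):
        print(workdir, " ".join(command))
```

Calling `build_all(["app1", "app2"], run=True)` from the top level of the repository would run the builds; the default dry run makes it possible to review the commands first.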
Parsl Config
Now that we have a Docker image available locally, we will create an executor that uses such an image to launch containers. Apps will execute in this environment.
Here is a Parsl configuration using one of the Docker images created in the previous section.
from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from libsubmit.providers.local.local import Local

config = Config(
    executors=[
        IPyParallelExecutor(
            label='pool_app1',
            container_image='app1_v0.1',
            provider=Local(init_blocks=2)
        )
    ],
    lazy_errors=True
)
For workflows with multiple apps that require different Docker images, a new executor should be created for each image that will be used. In the Parsl workflow definition, the app decorator can then be tagged with the executors keyword argument to ensure that apps execute on the specific executors with the right container image.
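As a sketch of the multi-image case, the configuration below defines one executor per image and tags each app with the matching label. It reuses the classes and arguments from the configuration above. The label `pool_app2`, the app names `app_1` and `app_2`, and the import path for `python_app` are assumptions that should be checked against the installed Parsl version.

```python
from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from libsubmit.providers.local.local import Local
# Assumption: import path of the decorator; verify for your Parsl version.
from parsl.app.app import python_app

config = Config(
    executors=[
        # One executor per container image.
        IPyParallelExecutor(
            label='pool_app1',
            container_image='app1_v0.1',
            provider=Local(init_blocks=2)
        ),
        IPyParallelExecutor(
            label='pool_app2',  # hypothetical label for the second image
            container_image='app2_v0.1',
            provider=Local(init_blocks=2)
        )
    ],
    lazy_errors=True
)


@python_app(executors=['pool_app1'])
def app_1(data):
    import app1  # copied into the container's working directory
    return app1.predict(data)


@python_app(executors=['pool_app2'])
def app_2(data):
    import app2
    return app2.predict(data)
```

Keeping the label and the image name visibly paired (`pool_app1` with `app1_v0.1`) makes it harder to route an app to a container that lacks its code.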
Caution
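If you have specific modules or Python packages that are imported from relative paths, the workers in the container will not have these available unless they are explicitly copied in. For example, the following commands copy the current directory into the working directory of a running container, where <image> stands for the image name and $DOCKER_ID for the container ID:
$ DOCKER_CWD=$(docker image inspect --format='{{.Config.WorkingDir}}' <image>)
$ docker cp -a . $DOCKER_ID:$DOCKER_CWD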
How this works
+-----local/Kubernetes/slurm... ---
|
+----- Parsl--------+ | +---------executor-1-------------+
| | | | ... |
| | | | +-------App1Container--------+ |
| App1(executors=['pool1'])------+-+--------app1.py | |
| | | | | +-----predict() | |
| X | | | +----------------------------+ |
| / \ | | +--------------------------------+
| Y...Y | |
| \ / | | +---------executor-2-------------+
| Z | | | ... |
| | | | +-------App2Container--------+ |
| App2(executors=['pool2'])------+-+------- app2.py | |
| | | | | +-----predict() | |
| | | | +----------------------------+ |
+-------------------+ | +--------------------------------+
|
+------------------- -- -
The diagram above illustrates the various components and how they interact with each other to act as a fast model serving system. In this model, each executor in the Parsl config definition can serve only one container image. Parsl launches multiple blocks matching the definition of the executor, and each block contains one container instantiated with a worker running inside. In the examples given above, the worker is launched in the working directory, which also contains the application code, app1.py.
The application codes app1.py and app2.py in our example Docker images both contain a simple Python function, predict(), that takes a list of numbers (floats or ints), applies a simple arithmetic operation, and returns a corresponding list.
Here are the contents of app1.py:
def predict(list_items):
    """Returns the double of the items"""
    return [i*2 for i in list_items]
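Because predict() is plain Python, it can be checked locally before the image is built. The following sketch repeats the function so that it is self-contained; the sample inputs are illustrative only.

```python
def predict(list_items):
    """Returns the double of the items"""
    return [i*2 for i in list_items]


if __name__ == "__main__":
    # Integers and floats are both accepted; each item is doubled.
    assert predict([1, 2, 3]) == [2, 4, 6]
    assert predict([0.5, -1]) == [1.0, -2]
    # An empty input yields an empty list.
    assert predict([]) == []
    print("predict() behaves as expected")
```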
Below is a snippet of Parsl code that imports the app1.py file and calls predict() on an executor that specifies the right container image, app1_v0.1:
@python_app(executors=['pool_app1'], cache=True)
def app_1(data):
    import app1
    return app1.predict(data)

x = app_1([1,2,3])
# The print statement prints [2, 4, 6] once the results are available
print(x.result())
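Calling an app returns a future right away, and result() blocks until the value is ready, so several invocations can be submitted before any result is collected. The sketch below illustrates that pattern with Python's standard concurrent.futures module as a stand-in for Parsl, so it runs without containers. The local predict() copy, the thread pool, and the helper name `run_batches` are substitutes chosen for illustration, not Parsl's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def predict(list_items):
    """Local copy of app1.predict, used here in place of the container."""
    return [i*2 for i in list_items]


def run_batches(batches, max_workers=2):
    """Submit every batch first, then wait for the results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns a future without waiting, like an app call.
        futures = [pool.submit(predict, batch) for batch in batches]
        # result() blocks until that particular future has completed.
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(run_batches([[1, 2, 3], [4, 5], []]))
```

With Parsl, the analogous pattern would be to call app_1 once per batch, keep the returned futures in a list, and call result() on each afterwards.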