Configuration

Parsl workflows are developed completely independently from their execution environment. Parsl offers an extensible configuration model through which the execution environment and communication with that environment are configured. Parsl is configured using a Config object. For more information, see the Config class documentation. The following shows how a configuration can be loaded.

import parsl
from parsl.config import Config
from parsl.executors.threads import ThreadPoolExecutor

config = Config(
    executors=[ThreadPoolExecutor()],
    lazy_errors=True
)
parsl.load(config)

Note

Please note that all configuration examples below import a user_opts file in which all user-specific options are defined. To use a configuration, these options must be supplied either by creating a user_opts file or by editing the configuration directly to include the user-specific information.
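As a rough illustration, a user_opts file for the Comet example might look like the sketch below. The key names mirror the lookups used in that configuration (user_opts['comet']['username'] and so on); every value is a placeholder, and the missing_options helper is hypothetical and not part of Parsl.

```python
# user_opts.py: a sketch with placeholder values; replace each with your own.
user_opts = {
    # Passed to Controller(public_ip=...) in the examples (placeholder address).
    'public_ip': '203.0.113.10',
    'comet': {
        'username': 'your_username',
        'script_dir': '/home/your_username/parsl_scripts',
        # Site-specific override text, e.g. scheduler options or environment
        # setup (placeholder; left empty here).
        'overrides': '',
    },
}


def missing_options(opts, site, required=('username', 'script_dir', 'overrides')):
    """Return the required keys that are absent for `site` (hypothetical helper)."""
    site_opts = opts.get(site, {})
    return [key for key in required if key not in site_opts]


if __name__ == '__main__':
    # An empty list means every option the Comet example reads is present.
    print(missing_options(user_opts, 'comet'))
    # No 'cori' entry was defined above, so all three keys are reported.
    print(missing_options(user_opts, 'cori'))
```

Checking for missing keys before building a Config is optional; it simply turns a late KeyError into an explicit list of what still needs to be filled in.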

Comet (SDSC)

https://ucsdnews.ucsd.edu/news_uploads/comet-logo.jpg

The following snippet shows an example configuration for executing remotely on San Diego Supercomputer Center’s Comet supercomputer. The example uses an SSHChannel to connect remotely to Comet, the SlurmProvider to interface with the Slurm scheduler used by Comet and the SrunLauncher to launch workers.

from libsubmit.channels import SSHChannel
from libsubmit.providers import SlurmProvider
from libsubmit.launchers import SrunLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller
from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='comet_ipp_multinode',
            provider=SlurmProvider(
                'debug',
                channel=SSHChannel(
                    hostname='comet.sdsc.xsede.org',
                    username=user_opts['comet']['username'],
                    script_dir=user_opts['comet']['script_dir']
                ),
                launcher=SrunLauncher(),
                overrides=user_opts['comet']['overrides'],
                walltime="00:10:00",
                init_blocks=1,
                max_blocks=1,
                nodes_per_block=2,
                tasks_per_node=1,
            ),
            controller=Controller(public_ip=user_opts['public_ip']),
        )

    ],
    run_dir=get_rundir()
)

Cori (NERSC)

https://6lli539m39y3hpkelqsm3c2fg-wpengine.netdna-ssl.com/wp-content/uploads/2017/08/Cori-NERSC.png

The following snippet shows an example configuration for accessing NERSC’s Cori supercomputer. This example uses the IPythonParallel executor and connects to Cori’s Slurm scheduler. It uses a remote SSH channel that allows the IPythonParallel controller to be hosted on the script’s submission machine (e.g., a PC). It is configured to request a single block of 2 nodes with 2 tasks per node. Finally, it includes override information to request a particular node type (Haswell) and to configure a specific Python environment on the worker nodes using Anaconda.

"""
    Block {Min:0, init:1, Max:1}
====================================
| ++++++++++++++ || ++++++++++++++ |
| |    Node    | || |    Node    | |
| |            | || |            | |
| | Task  Task | || | Task  Task | |
| |            | || |            | |
| ++++++++++++++ || ++++++++++++++ |
====================================
"""
from libsubmit.providers import SlurmProvider
from libsubmit.channels import SSHChannel
from libsubmit.launchers import SrunLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller
from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='cori_ipp_multinode',
            provider=SlurmProvider(
                'debug',
                channel=SSHChannel(
                    hostname='cori.nersc.gov',
                    username=user_opts['cori']['username'],
                    script_dir=user_opts['cori']['script_dir']
                ),
                nodes_per_block=2,
                tasks_per_node=2,
                init_blocks=1,
                max_blocks=1,
                overrides=user_opts['cori']['overrides'],
                launcher=SrunLauncher(),
            ),
            controller=Controller(public_ip=user_opts['public_ip']),
        )
    ],
    run_dir=get_rundir(),
)
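
To make the block sizing concrete, the following sketch computes how many task slots the settings above would provide, assuming (as the diagram suggests) that each block holds nodes_per_block nodes and each node hosts tasks_per_node tasks. The task_slots function is illustrative only and is not a Parsl API.

```python
def task_slots(nodes_per_block, tasks_per_node, blocks=1):
    """Task slots across `blocks` blocks, assuming blocks * nodes * tasks per node."""
    return blocks * nodes_per_block * tasks_per_node


# Values taken from the Cori configuration: one block of 2 nodes, 2 tasks per node.
cori_slots = task_slots(nodes_per_block=2, tasks_per_node=2, blocks=1)
print(cori_slots)  # 4
```

Under the same assumption, the Comet configuration (2 nodes per block, 1 task per node, 1 block) would provide 2 slots.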

Theta (ALCF)

https://www.alcf.anl.gov/files/ALCF-Theta_111016-1000px.jpg

The following snippet shows an example configuration for executing on Argonne Leadership Computing Facility’s Theta supercomputer. This example uses the IPythonParallel executor and connects to Theta’s Cobalt scheduler using the CobaltProvider. This configuration assumes that the script is being executed on the login nodes of Theta.

from libsubmit.providers import CobaltProvider
from libsubmit.launchers import AprunLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller
from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='theta_local_ipp_multinode',
            provider=CobaltProvider(
                queue="debug-flat-quad",
                launcher=AprunLauncher(),
                walltime="00:30:00",
                nodes_per_block=2,
                tasks_per_node=1,
                init_blocks=1,
                max_blocks=1,
                overrides=user_opts['theta']['overrides'],
                account=user_opts['theta']['account'],
                cmd_timeout=60
            ),
            controller=Controller(public_ip=user_opts['public_ip'])
        )

    ],
    run_dir=get_rundir(),

)

Cooley (ALCF)

https://today.anl.gov/wp-content/uploads/sites/44/2015/06/Cray-Cooley.jpg

The following snippet shows an example configuration for executing remotely on Argonne Leadership Computing Facility’s Cooley analysis and visualization system. The example uses an SSHInteractiveLoginChannel to connect remotely to Cooley using ALCF’s 2FA token. The configuration uses the CobaltProvider to interface with Cooley’s scheduler.

# Untested
from libsubmit.channels import SSHInteractiveLoginChannel
from libsubmit.providers import CobaltProvider
from libsubmit.launchers import SingleNodeLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller
from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='cooley_ssh_il_local_single_node',
            provider=CobaltProvider(
                channel=SSHInteractiveLoginChannel(
                    hostname='cooley.alcf.anl.gov',
                    username=user_opts['cooley']['username'],
                    script_dir="/home/{}/parsl_scripts/".format(user_opts['cooley']['username'])
                ),
                nodes_per_block=1,
                tasks_per_node=1,
                init_blocks=1,
                max_blocks=1,
                walltime="00:05:00",
                overrides=user_opts['cooley']['overrides'],
                queue='pubnet-debug',
                account=user_opts['cooley']['account'],
                launcher=SingleNodeLauncher(),
            ),
            controller=Controller(public_ip=user_opts['public_ip'])
        )

    ],
    run_dir=get_rundir(),
)

Swan (Cray)

https://www.cray.com/blog/wp-content/uploads/2016/11/XC50-feat-blog.jpg

The following snippet shows an example configuration for executing remotely on Swan, an XC50 machine hosted by the Cray Partner Network. The example uses an SSHChannel to connect remotely to Swan, the TorqueProvider to interface with the scheduler, and the AprunLauncher to launch workers on the machine.

"""
    Block
====================================
| ++++++++++++++ || ++++++++++++++ |
| |    Node    | || |    Node    | |
| |            | || |            | |
| | Task  Task | || | Task  Task | |
| |            | || |            | |
| ++++++++++++++ || ++++++++++++++ |
====================================
"""
from libsubmit.channels import SSHChannel
from libsubmit.launchers import AprunLauncher
from libsubmit.providers import TorqueProvider

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller

from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='swan_ipp',
            provider=TorqueProvider(
                channel=SSHChannel(
                    hostname='swan.cray.com',
                    username=user_opts['swan']['username'],
                    script_dir=user_opts['swan']['script_dir'],
                ),
                nodes_per_block=2,
                tasks_per_node=2,
                init_blocks=1,
                max_blocks=1,
                launcher=AprunLauncher(),
                overrides=user_opts['swan']['overrides']
            ),
            controller=Controller(public_ip=user_opts['public_ip']),
        )

    ],
    run_dir=get_rundir()
)

CC-IN2P3

https://cc.in2p3.fr/wp-content/uploads/2017/03/bandeau_accueil.jpg

The snippet below shows an example configuration for executing from a login node on IN2P3’s Computing Centre. The configuration uses the LocalChannel to run on a login node, primarily to avoid GSISSH, which Parsl does not support yet. This system uses Grid Engine, which Parsl interfaces with using the GridEngineProvider.

"""
================== Block
| ++++++++++++++ | Node
| |            | |
| |    Task    | |             . . .
| |            | |
| ++++++++++++++ |
==================
"""
from libsubmit.channels import LocalChannel
from libsubmit.providers import GridEngineProvider
from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='cc_in2p3_local_single_node',
            provider=GridEngineProvider(
                channel=LocalChannel(),
                nodes_per_block=1,
                tasks_per_node=1,
                init_blocks=1,
                max_blocks=1,
                walltime="00:20:00",
                overrides=user_opts['cc_in2p3']['overrides'],
            ),
            engine_debug_level='DEBUG',
        )

    ],
    run_dir=get_rundir()
)

Midway (RCC, UChicago)

https://rcc.uchicago.edu/sites/rcc.uchicago.edu/files/styles/slideshow-image/public/uploads/images/slideshows/20140430_RCC_8978.jpg?itok=BmRuJ-wq

The Midway cluster is a campus cluster hosted by the Research Computing Center at the University of Chicago. The snippet below shows an example configuration for executing remotely on Midway. The configuration uses the SSHChannel to connect remotely to Midway, the SlurmProvider to interface with the scheduler, and the SrunLauncher to launch workers.

from libsubmit.channels import SSHChannel
from libsubmit.providers import SlurmProvider
from libsubmit.launchers import SrunLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller
from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='midway_ipp_multinode',
            provider=SlurmProvider(
                'westmere',
                channel=SSHChannel(
                    hostname='swift.rcc.uchicago.edu',
                    username=user_opts['midway']['username'],
                    script_dir=user_opts['midway']['script_dir']
                ),
                launcher=SrunLauncher(),
                overrides=user_opts['midway']['overrides'],
                walltime="00:05:00",
                init_blocks=1,
                max_blocks=1,
                nodes_per_block=2,
                tasks_per_node=1,
            ),
            controller=Controller(public_ip=user_opts['public_ip']),
        )

    ],
    run_dir=get_rundir()
)

Open Science Grid

https://hcc-docs.unl.edu/download/attachments/11635314/Screen%20Shot%202013-03-19%20at%202.19.28%20PM.png?version=1&modificationDate=1492720049000&api=v2

The Open Science Grid (OSG) is a national, distributed computing grid spanning over 100 individual sites to provide tens of thousands of CPU cores. The snippet below shows an example configuration for executing remotely on OSG. The configuration uses the SSHChannel to connect remotely to OSG and the Condor provider to interface with the scheduler.

from parsl.executors.ipp_controller import Controller
from libsubmit.channels.ssh.ssh import SSHChannel
from libsubmit.providers.condor.condor import Condor
from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='osg_remote_ipp',
            provider=Condor(
                channel=SSHChannel(
                    hostname='login.osgconnect.net',
                    username=user_opts['osg']['username'],
                    script_dir=user_opts['osg']['script_dir']
                ),
                nodes_per_block=1,
                tasks_per_node=1,
                init_blocks=4,
                max_blocks=4,
                overrides='Requirements = OSGVO_OS_STRING == "RHEL 6" && Arch == "X86_64" &&  HAS_MODULES == True',
                worker_setup=user_opts['osg']['worker_setup'],
                walltime="01:00:00"
            ),
            controller=Controller(public_ip=user_opts['public_ip'])
        )
    ],
    run_dir=get_rundir()
)

Amazon Web Services

../_images/aws_image.png

Note

Please note that the boto3 library is required to use AWS with Parsl. It can be installed via python3 -m pip install libsubmit+aws

Amazon Web Services (AWS) is a commercial cloud service that allows you to rent a range of computers and other computing services. The snippet below shows an example configuration for provisioning nodes from the Elastic Compute Cloud (EC2) service. The first run configures a Virtual Private Cloud and other networking and security infrastructure, which is reused in subsequent runs. The configuration uses the AWSProvider to connect to AWS.

"""Config for EC2.

Block {Min:0, init:1, Max:1}
==================
| ++++++++++++++ |
| |    Node    | |
| |            | |
| | Task  Task | |
| |            | |
| ++++++++++++++ |
==================

"""
from libsubmit.providers import AWSProvider

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller
from parsl.tests.utils import get_rundir

# If you are a developer running tests, make sure to update parsl/tests/configs/user_opts.py
# If you are a user copying-and-pasting this as an example, make sure to either
#       1) create a local `user_opts.py`, or
#       2) delete the user_opts import below and replace all appearances of `user_opts` with the literal value
#          (i.e., user_opts['swan']['username'] -> 'your_username')
from .user_opts import user_opts

config = Config(
    executors=[
        IPyParallelExecutor(
            label='ec2_single_node',
            provider=AWSProvider(
                user_opts['ec2']['image_id'],
                region=user_opts['ec2']['region'],
                key_name=user_opts['ec2']['key_name'],
                profile="default",
                state_file='awsproviderstate.json',
                nodes_per_block=1,
                tasks_per_node=2,
                init_blocks=1,
                max_blocks=1,
                min_blocks=0,
                walltime='01:00:00',
            ),
            controller=Controller(public_ip=user_opts['public_ip']),
        )
    ],
    run_dir=get_rundir(),
)

Further help

For help constructing a configuration, you can click on class names such as Config or IPyParallelExecutor to see the associated class documentation. The same documentation can be accessed interactively at the Python command line, for example:

>>> from parsl.config import Config
>>> help(Config)
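
When only the constructor parameters are of interest, the standard library's inspect module offers a more compact view than help(). The sketch below is a minimal illustration: DemoConfig is a hypothetical stand-in with made-up defaults (it is not Parsl's Config), and list_parameters is an illustrative helper. The same helper could presumably be pointed at Config or an executor class once Parsl is installed.

```python
import inspect


class DemoConfig:
    """Hypothetical stand-in for a configuration class; not Parsl's Config."""

    def __init__(self, executors=None, lazy_errors=True, run_dir='runinfo'):
        self.executors = executors
        self.lazy_errors = lazy_errors
        self.run_dir = run_dir


def list_parameters(cls):
    """Map each constructor parameter name to its default (or '<required>')."""
    params = inspect.signature(cls).parameters
    return {
        name: ('<required>' if p.default is inspect.Parameter.empty else p.default)
        for name, p in params.items()
    }


print(list_parameters(DemoConfig))
```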