Configuration

Parsl workflows are developed completely independently from their execution environment. Parsl offers an extensible configuration model through which the execution environment and communication with that environment is configured. Parsl is configured using Config object. For more information, see the Config class documentation. The following shows how the configuration can be loaded.

import parsl
from parsl.config import Config
from parsl.executors.threads import ThreadPoolExecutor

config = Config(
    executors=[ThreadPoolExecutor()],
    lazy_errors=True
)
parsl.load(config)

Note

Please note that all configuration examples below require customization for your account, allocation, Python environment, etc.

How-to Configure

The configuration provided to Parsl dictates the shape and limits of various resources to be provisioned for the workflow. Therefore it is important to carefully evaluate certain aspects of the workflow and the planned compute resources to determine an ideal configuration match.

Here are a series of question to help formulate a suitable configuration:

  1. Where would you like the tasks that comprise the workflow to execute?
Target Executor Provider
Laptop/Workstation LocalProvider
Amazon Web Services AWSProvider
Google Cloud GoogleCloudProvider
Slurm based cluster or supercomputer SlurmProvider
Torque/PBS based cluster or supercomputer TorqueProvider
Cobalt based cluster or supercomputer CobaltProvider
GridEngine based cluster or grid GridEngineProvider
Condor based cluster or grid CondorProvider
Kubernetes cluster KubernetesProvider
  1. How many nodes do you have to execute them ? What task durations give good performance on different executors?
Executor Number of Nodes [*] Task duration for good performance
ThreadPoolExecutor 1 (Only local) Any
LowLatencyExecutor <=10 10ms+
IPyParallelExecutor <=128 50ms+
HighThroughputExecutor <=2000
Task duration(s)/#nodes >= 0.01

longer tasks needed at higher scale

ExtremeScaleExecutor >1000, <=8000 [†] >minutes
[*]We assume that each node has 32 workers. If there are fewer workers launched per node, a higher number of nodes could be supported.
[†]8000 nodes with 32 workers each totalling 256000 workers is the maximum scale that we’ve tested the ExtremeScaleExecutor at.

Warning

IPyParallelExecutor will be deprecated as of Parsl v0.8.0, with HighThroughputExecutor as the recommended replacement.

  1. If you are running on a cluster or supercomputer, will you request multiple nodes per block ? Note that in this case a block is equivalent to a batch job.
nodes_per_block = 1
Provider Executor choice Suitable Launchers
Systems that don’t use Aprun Any
Aprun based systems Any
nodes_per_block > 1
Provider Executor choice Suitable Launchers
TorqueProvider Any
CobaltProvider Any
SlurmProvider Any

Note

If you are on a Cray system, you most likely need the AprunLauncher to launch workers unless you are on a native Slurm system like Cori (NERSC)

  1. Where will you run the main parsl process vs the tasks?
Workflow location Execution target Suitable channel
Laptop/Workstation Laptop/Workstation LocalChannel
Laptop/Workstation Cloud Resources None
Laptop/Workstation Clusters with no 2FA SSHChannel
Laptop/Workstation Clusters with 2FA SSHInteractiveLoginChannel
Login node Cluster/Supercomputer LocalChannel

Comet (SDSC)

https://ucsdnews.ucsd.edu/news_uploads/comet-logo.jpg

The following snippet shows an example configuration for executing remotely on San Diego Supercomputer Center’s Comet supercomputer. The example uses an SSHChannel to connect remotely to Comet, the SlurmProvider to interface with the Slurm scheduler used by Comet and the SrunLauncher to launch workers.

from parsl.channels import SSHChannel
from parsl.providers import SlurmProvider
from parsl.launchers import SrunLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='comet_ipp_multinode',
            provider=SlurmProvider(
                'debug',
                channel=SSHChannel(
                    hostname='comet.sdsc.xsede.org',
                    username='USERNAME',     # Please replace USERNAME with your username
                    script_dir='/home/USERNAME/parsl_scripts',    # Please replace USERNAME with your username
                ),
                launcher=SrunLauncher(),
                scheduler_options='',     # Input your scheduler_options if needed
                worker_init='',     # Input your worker_init if needed
                walltime="00:10:00",
                init_blocks=1,
                max_blocks=1,
                nodes_per_block=2,
            ),
            controller=Controller(public_ip='PUBLIC_IP'),    # Please replace PUBLIC_IP with your public ip
        )

    ],
)

Cori (NERSC)

https://6lli539m39y3hpkelqsm3c2fg-wpengine.netdna-ssl.com/wp-content/uploads/2017/08/Cori-NERSC.png

The following snippet shows an example configuration for accessing NERSC’s Cori supercomputer. This example uses the IPythonParallel executor and connects to Cori’s Slurm scheduler. It uses a remote SSH channel that allows the IPythonParallel controller to be hosted on the script’s submission machine (e.g., a PC). It is configured to request 2 nodes configured with 1 TaskBlock per node. Finally it includes override information to request a particular node type (Haswell) and to configure a specific Python environment on the worker nodes using Anaconda.

"""
    Block {Min:0, init:1, Max:1}
====================================
| ++++++++++++++ || ++++++++++++++ |
| |    Node    | || |    Node    | |
| |            | || |            | |
| | Task  Task | || | Task  Task | |
| |            | || |            | |
| ++++++++++++++ || ++++++++++++++ |
====================================
"""
from parsl.providers import SlurmProvider
from parsl.channels import SSHChannel
from parsl.launchers import SrunLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='cori_ipp_multinode',
            provider=SlurmProvider(
                'debug',
                channel=SSHChannel(
                    hostname='cori.nersc.gov',
                    username='USERNAME',     # Please replace USERNAME with your username
                    script_dir='/global/homes/y/USERNAME/parsl_scripts',    # Please replace USERNAME with your username
                ),
                nodes_per_block=2,
                init_blocks=1,
                max_blocks=1,
                scheduler_options='',     # Input your scheduler_options if needed
                worker_init='',     # Input your worker_init if needed
                launcher=SrunLauncher(),
            ),
            controller=Controller(public_ip='PUBLIC_IP'),    # Please replace PUBLIC_IP with your public ip
        )
    ],
)

Stampede2 (TACC)

https://www.tacc.utexas.edu/documents/1084364/1413880/stampede2-0717.jpg/

The following snippet shows an example configuration for accessing TACC’s Stampede2 supercomputer. This example uses theHighThroughput executor and connects to Stampede2’s Slurm scheduler.

from parsl.config import Config

from parsl.channels import LocalChannel
from parsl.providers import SlurmProvider
from parsl.executors import HighThroughputExecutor
from parsl.addresses import address_by_hostname

from parsl.data_provider.scheme import GlobusScheme

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')


config = Config(
    executors=[
        HighThroughputExecutor(
            label="stampede2_htex",
            worker_debug=False,
            address=address_by_hostname(),
            provider=SlurmProvider(
                channel=LocalChannel(),
                nodes_per_block=2,
                init_blocks=1,
                min_blocks=1,
                max_blocks=1,
                partition='PARTITION',  # Replace with partition name
                scheduler_options='',   # Enter scheduler_options if needed
                worker_init='',         # Enter worker_init if needed
                walltime='00:30:00'
            ),
            storage_access=[GlobusScheme(
                endpoint_uuid="ceea5ca0-89a9-11e7-a97f-22000a92523b",
                endpoint_path="/",
                local_path="/"
            )]
        )

    ],
)

Theta (ALCF)

https://www.alcf.anl.gov/files/ALCF-Theta_111016-1000px.jpg

The following snippet shows an example configuration for executing on Argonne Leadership Computing Facility’s Theta supercomputer. This example uses the IPyParallelExecutor and connects to Theta’s Cobalt scheduler using the CobaltProvider. This configuration assumes that the script is being executed on the login nodes of Theta.

from parsl.providers import CobaltProvider
from parsl.launchers import AprunLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='theta_local_ipp_multinode',
            workers_per_node=1,
            provider=CobaltProvider(
                queue="debug-flat-quad",
                launcher=AprunLauncher(),
                walltime="00:30:00",
                nodes_per_block=2,
                init_blocks=1,
                max_blocks=1,
                scheduler_options='',     # Input your scheduler_options if needed
                worker_init='',     # Input your worker_init if needed
                account='ALCF_ALLOCATION',    # Please replace ALCF_ALLOCATION with your ALCF allocation
                cmd_timeout=60
            ),
            controller=Controller(public_ip='PUBLIC_IP'),    # Please replace PUBLIC_IP with your public ip
        )

    ],

)

Cooley (ALCF)

https://today.anl.gov/wp-content/uploads/sites/44/2015/06/Cray-Cooley.jpg

The following snippet shows an example configuration for executing remotely on Argonne Leadership Computing Facility’s Cooley analysis and visualization system. The example uses an SSHInteractiveLoginChannel to connect remotely to Cooley using ALCF’s 2FA token. The configuration uses the CobaltProvider to interface with Cooley’s scheduler.

# Untested
from parsl.channels import SSHInteractiveLoginChannel
from parsl.providers import CobaltProvider
from parsl.launchers import SingleNodeLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='cooley_ssh_il_local_single_node',
            provider=CobaltProvider(
                channel=SSHInteractiveLoginChannel(
                    hostname='cooley.alcf.anl.gov',
                    username='USERNAME',     # Please replace USERNAME with your username
                    script_dir='/home/USERNAME/parsl_scripts/',    # Please replace USERNAME with your username
                ),
                nodes_per_block=1,
                init_blocks=1,
                max_blocks=1,
                walltime="00:05:00",
                scheduler_options='',     # Input your scheduler_options if needed
                worker_init='',     # Input your worker_init if needed
                queue='pubnet-debug',
                account='ALCF_ALLOCATION',    # Please replace ALCF_ALLOCATION with your ALCF allocation
                launcher=SingleNodeLauncher(),
            ),
            controller=Controller(public_ip='PUBLIC_IP'),    # Please replace PUBLIC_IP with your public ip
        )

    ],
)

Swan (Cray)

https://www.cray.com/blog/wp-content/uploads/2016/11/XC50-feat-blog.jpg

The following snippet shows an example configuration for executing remotely on Swan, an XC50 machine hosted by the Cray Partner Network. The example uses an SSHChannel to connect remotely Swan, uses the TorqueProvider to interface with the scheduler and the AprunLauncher to launch workers on the machine

"""
    Block
====================================
| ++++++++++++++ || ++++++++++++++ |
| |    Node    | || |    Node    | |
| |            | || |            | |
| | Task  Task | || | Task  Task | |
| |            | || |            | |
| ++++++++++++++ || ++++++++++++++ |
====================================
"""
from parsl.channels import SSHChannel
from parsl.launchers import AprunLauncher
from parsl.providers import TorqueProvider

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller


# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='swan_ipp',
            workers_per_node=2,
            provider=TorqueProvider(
                channel=SSHChannel(
                    hostname='swan.cray.com',
                    username='USERNAME',     # Please replace USERNAME with your username
                    script_dir='/home/users/USERNAME/parsl_scripts',    # Please replace USERNAME with your username
                ),
                nodes_per_block=2,
                init_blocks=1,
                max_blocks=1,
                launcher=AprunLauncher(),
                scheduler_options='',     # Input your scheduler_options if needed
                worker_init='',     # Input your worker_init if needed
            ),
            controller=Controller(public_ip='PUBLIC_IP'),    # Please replace PUBLIC_IP with your public ip
        )

    ],
)

CC-IN2P3

https://cc.in2p3.fr/wp-content/uploads/2017/03/bandeau_accueil.jpg

The snippet below shows an example configuration for executing from a login node on IN2P3’s Computing Centre. The configuration uses the LocalProvider to run on a login node primarily to avoid GSISSH, which Parsl does not support yet. This system uses Grid Engine which Parsl interfaces with using the GridEngineProvider.

"""
================== Block
| ++++++++++++++ | Node
| |            | |
| |    Task    | |             . . .
| |            | |
| ++++++++++++++ |
==================
"""
from parsl.channels import LocalChannel
from parsl.providers import GridEngineProvider
from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='cc_in2p3_local_single_node',
            provider=GridEngineProvider(
                channel=LocalChannel(),
                nodes_per_block=1,
                init_blocks=1,
                max_blocks=1,
                walltime="00:20:00",
                scheduler_options='',     # Input your scheduler_options if needed
                worker_init='',     # Input your worker_init if needed
            ),
            engine_debug_level='DEBUG',
        )

    ],
)

Midway (RCC, UChicago)

https://rcc.uchicago.edu/sites/rcc.uchicago.edu/files/styles/slideshow-image/public/uploads/images/slideshows/20140430_RCC_8978.jpg?itok=BmRuJ-wq

This Midway cluster is a campus cluster hosted by the Research Computing Center at the University of Chicago. The snippet below shows an example configuration for executing remotely on Midway. The configuration uses the SSHChannel to connect remotely to Midway, uses the SlurmProvider to interface with the scheduler, and uses the SrunLauncher to launch workers.

from parsl.channels import SSHChannel
from parsl.providers import SlurmProvider
from parsl.launchers import SrunLauncher

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='midway_ipp_multinode',
            provider=SlurmProvider(
                'westmere',
                channel=SSHChannel(
                    hostname='swift.rcc.uchicago.edu',
                    username='USERNAME',     # Please replace USERNAME with your username
                    script_dir='/scratch/midway2/USERNAME/parsl_scripts',    # Please replace USERNAME with your username
                ),
                launcher=SrunLauncher(),
                scheduler_options='',     # Input your scheduler_options if needed
                worker_init='',     # Input your worker_init if needed
                walltime="00:05:00",
                init_blocks=1,
                max_blocks=1,
                nodes_per_block=2,
            ),
            controller=Controller(public_ip='PUBLIC_IP'),    # Please replace PUBLIC_IP with your public ip
        )

    ],
)

Open Science Grid

https://hcc-docs.unl.edu/download/attachments/11635314/Screen%20Shot%202013-03-19%20at%202.19.28%20PM.png?version=1&modificationDate=1492720049000&api=v2

The Open Science Grid (OSG) is a national, distributed computing Grid spanning over 100 individual sites to provide tens of thousands of CPU cores. The snippet below shows an example configuration for executing remotely on OSG. The configuration uses the SSHChannel to connect remotely to OSG, uses the CondorProvider to interface with the scheduler.

from parsl.executors.ipp_controller import Controller
from parsl.channels.ssh.ssh import SSHChannel
from parsl.providers.condor.condor import Condor
from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='osg_remote_ipp',
            provider=Condor(
                channel=SSHChannel(
                    hostname='login.osgconnect.net',
                    username='USERNAME',     # Please replace USERNAME with your username
                    script_dir='/home/USERNAME/parsl_scripts',    # Please replace USERNAME with your username
                ),
                nodes_per_block=1,
                init_blocks=4,
                max_blocks=4,
                scheduler_options='Requirements = OSGVO_OS_STRING == "RHEL 6" && Arch == "X86_64" &&  HAS_MODULES == True',
                worker_init='',     # Input your worker_init if needed
                walltime="01:00:00"
            ),
            controller=Controller(public_ip='PUBLIC_IP'),    # Please replace PUBLIC_IP with your public ip
        )
    ],
)

Amazon Web Services

../_images/aws_image.png

Note

Please note that boto3 library is a requirement to use AWS with Parsl. This can be installed via python3 -m pip install parsl[aws]

Amazon Web services is a commercial cloud service which allows you to rent a range of computers and other computing services. The snippet below shows an example configuration for provisioning nodes from the Elastic Compute Cloud (EC2) service. The first run would configure a Virtual Private Cloud and other networking and security infrastructure that will be re-used in subsequent runs. The configuration uses the AWSProvider to connect to AWS

"""Config for EC2.

Block {Min:0, init:1, Max:1}
==================
| ++++++++++++++ |
| |    Node    | |
| |            | |
| | Task  Task | |
| |            | |
| ++++++++++++++ |
==================

"""
from parsl.providers import AWSProvider

from parsl.config import Config
from parsl.executors.ipp import IPyParallelExecutor
from parsl.executors.ipp_controller import Controller

# This is an example config, make sure to
#        replace the specific values below with the literal values
#          (e.g., 'USERNAME' -> 'your_username')

config = Config(
    executors=[
        IPyParallelExecutor(
            label='ec2_single_node',
            provider=AWSProvider(
                'image_id',    # Please replace image_id with your image id, e.g., 'ami-82f4dae7'
                region='us-east-1',    # Please replace region with your region
                key_name='KEY',    # Please replace KEY with your key name
                profile="default",
                state_file='awsproviderstate.json',
                nodes_per_block=1,
                init_blocks=1,
                max_blocks=1,
                min_blocks=0,
                walltime='01:00:00',
            ),
            controller=Controller(public_ip='PUBLIC_IP'),    # Please replace PUBLIC_IP with your public ip
        )
    ],
)

Ad-Hoc Clusters

Any collection of compute nodes without a scheduler setup for task scheduling can be considered an ad-hoc cluster. Often these machines have a shared-filesystem such as NFS or Lustre. In order to use these resources with Parsl, they need to set-up for password-less SSH access.

In order to use these ssh-accessible collection of nodes as an ad-hoc cluster, we create an executor for each node, using the LocalProvider with SSHChannel to identify the node by hostname. Here’s an example configuration using the 2 login nodes from the Midway cluster as a proxy for an ad-hoc cluster.

Note

Multiple blocks should not be assigned to each node when using the HighThroughputExecutor

Further help

For help constructing a configuration, you can click on class names such as Config or IPyParallelExecutor to see the associated class documentation. The same documentation can be accessed interactively at the python command line via, for example:

>>> from parsl.config import Config
>>> help(Config)