Example configurations
Note
All configuration examples below must be customized for the user’s allocation, Python environment, file system, etc.
The configuration specifies what, and how, resources are to be used for executing the Parsl program and its apps. It is important to carefully consider the needs of the Parsl program and its apps, and the characteristics of the compute resources, to determine an ideal configuration. Aspects to consider include: 1) where the Parsl apps will execute; 2) how many nodes will be used to execute the apps, and how long the apps will run; 3) should Parsl request multiple nodes in an individual scheduler job; and 4) where will the main Parsl program run and how will it communicate with the apps.
Stepping through the following questions should help formulate a suitable configuration object.
1. Where should apps be executed?
| Target | Executor | Provider |
|---|---|---|
| Laptop/Workstation | parsl.executors.HighThroughputExecutor, parsl.executors.ThreadPoolExecutor, parsl.executors.WorkQueueExecutor, or parsl.executors.taskvine.TaskVineExecutor | parsl.providers.LocalProvider |
| Amazon Web Services | parsl.executors.HighThroughputExecutor | parsl.providers.AWSProvider |
| Google Cloud | parsl.executors.HighThroughputExecutor | parsl.providers.GoogleCloudProvider |
| Slurm based system | parsl.executors.HighThroughputExecutor, parsl.executors.WorkQueueExecutor, or parsl.executors.taskvine.TaskVineExecutor | parsl.providers.SlurmProvider |
| Torque/PBS based system | parsl.executors.HighThroughputExecutor, parsl.executors.WorkQueueExecutor, or parsl.executors.taskvine.TaskVineExecutor | parsl.providers.TorqueProvider |
| GridEngine based system | parsl.executors.HighThroughputExecutor, parsl.executors.WorkQueueExecutor, or parsl.executors.taskvine.TaskVineExecutor | parsl.providers.GridEngineProvider |
| Condor based cluster or grid | parsl.executors.HighThroughputExecutor, parsl.executors.WorkQueueExecutor, or parsl.executors.taskvine.TaskVineExecutor | parsl.providers.CondorProvider |
| Kubernetes cluster | parsl.executors.HighThroughputExecutor | parsl.providers.KubernetesProvider |
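For instance, the first row above (running on a laptop or workstation) can be satisfied with a minimal sketch that pairs the parsl.executors.HighThroughputExecutor with the parsl.providers.LocalProvider; the label and worker count below are illustrative placeholders, not recommended values.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import LocalProvider

# Minimal sketch for a laptop/workstation: a single block of workers on the
# local machine. The label and max_workers_per_node are placeholders.
config = Config(
    executors=[
        HighThroughputExecutor(
            label='local_htex',
            max_workers_per_node=4,
            provider=LocalProvider(
                init_blocks=1,
                max_blocks=1,
            ),
        )
    ],
)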
2. How many nodes will be used to execute the apps? What task durations are necessary to achieve good performance?
| Executor | Number of Nodes [*] | Task duration for good performance |
|---|---|---|
| parsl.executors.ThreadPoolExecutor | 1 (only local) | Any |
| parsl.executors.HighThroughputExecutor | <=2000 | Task duration (s) / #nodes >= 0.01; longer tasks are needed at larger scale |
| parsl.executors.WorkQueueExecutor | <=1000 [†] | 10s+ |
| parsl.executors.taskvine.TaskVineExecutor | <=1000 [‡] | 10s+ |
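The rule of thumb in the parsl.executors.HighThroughputExecutor row, task duration (s) / #nodes >= 0.01, can be rearranged to estimate the shortest task that is likely to scale well for a given node count. The helper below is only an illustration of that arithmetic and is not part of Parsl.
# Illustrative arithmetic only: task_duration_seconds / number_of_nodes >= 0.01
def minimum_task_seconds(number_of_nodes: int) -> float:
    """Smallest task duration (in seconds) expected to keep HTEX efficient."""
    return 0.01 * number_of_nodes

print(minimum_task_seconds(100))   # 1.0  -> aim for tasks of ~1 s or longer
print(minimum_task_seconds(2000))  # 20.0 -> aim for tasks of ~20 s or longer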
3. Should Parsl request multiple nodes in an individual scheduler job? (Here the term block is equivalent to a single scheduler job.)
nodes_per_block = 1
| Provider | Executor choice | Suitable Launchers |
|---|---|---|
| Systems that don’t use Aprun | Any | parsl.launchers.SingleNodeLauncher (the default), parsl.launchers.SimpleLauncher |
| Aprun based systems | Any | parsl.launchers.AprunLauncher |
nodes_per_block > 1
| Provider | Executor choice | Suitable Launchers |
|---|---|---|
| parsl.providers.TorqueProvider | Any | parsl.launchers.AprunLauncher, parsl.launchers.MpiExecLauncher |
| parsl.providers.SlurmProvider | Any | parsl.launchers.SrunLauncher if native Slurm, else parsl.launchers.AprunLauncher |
Note
If using a Cray system, you most likely need to use the parsl.launchers.AprunLauncher to launch workers, unless you are on a native Slurm system such as Perlmutter (NERSC).
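As a hedged sketch of question 3, the configuration below requests several nodes in each block (scheduler job) on a native Slurm system and uses the parsl.launchers.SrunLauncher to start a worker pool on every node; the partition, account, and sizing values are placeholders.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider

# Sketch only: each block (one scheduler job) requests 4 nodes, and the
# SrunLauncher places workers on every node of the block.
config = Config(
    executors=[
        HighThroughputExecutor(
            label='multinode_block_htex',
            provider=SlurmProvider(
                'YOUR_PARTITION',            # placeholder partition name
                account='YOUR_ALLOCATION',   # placeholder allocation
                nodes_per_block=4,
                init_blocks=1,
                max_blocks=1,
                walltime='00:30:00',
                launcher=SrunLauncher(),
            ),
        )
    ],
)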
Ad-Hoc Clusters
Parsl’s support of ad-hoc clusters of compute nodes without a scheduler is deprecated.
See issue #3515 for further discussion.
Amazon Web Services
Note
To use AWS with Parsl, install Parsl with AWS dependencies via python3 -m pip install 'parsl[aws]'
Amazon Web Services is a commercial cloud service which allows users to rent a range of computers and other computing services.
The following snippet shows how Parsl can be configured to provision nodes from the Elastic Compute Cloud (EC2) service.
The first time this configuration is used, Parsl will configure a Virtual Private Cloud and other networking and security infrastructure that will be
re-used in subsequent executions. The configuration uses the parsl.providers.AWSProvider
to connect to AWS.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import AWSProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label='ec2_single_node',
provider=AWSProvider(
# Specify your EC2 AMI id
'YOUR_AMI_ID',
# Specify the AWS region to provision from
# eg. us-east-1
region='YOUR_AWS_REGION',
# Specify the name of the key to allow ssh access to nodes
key_name='YOUR_KEY_NAME',
profile="default",
state_file='awsproviderstate.json',
nodes_per_block=1,
init_blocks=1,
max_blocks=1,
min_blocks=0,
walltime='01:00:00',
),
)
],
usage_tracking=LEVEL_1,
)
ASPIRE 1 (NSCC)
The following snippet shows an example configuration for accessing NSCC’s ASPIRE 1 supercomputer. This example uses the parsl.executors.HighThroughputExecutor and connects to ASPIRE 1’s PBSPro scheduler. It also shows how the scheduler_options parameter can be used to schedule array jobs in PBSPro.
from parsl.addresses import address_by_interface
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import MpiRunLauncher
from parsl.monitoring.monitoring import MonitoringHub
from parsl.providers import PBSProProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label="htex",
heartbeat_period=15,
heartbeat_threshold=120,
worker_debug=True,
max_workers_per_node=4,
address=address_by_interface('ib0'),
provider=PBSProProvider(
launcher=MpiRunLauncher(),
# PBS directives (header lines): for array jobs pass '-J' option
scheduler_options='#PBS -J 1-10',
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init='',
# number of compute nodes allocated for each block
nodes_per_block=3,
min_blocks=3,
max_blocks=5,
cpus_per_node=24,
# medium queue has a max walltime of 24 hrs
walltime='24:00:00'
),
),
],
monitoring=MonitoringHub(
hub_address=address_by_interface('ib0'),
resource_monitoring_interval=10,
),
strategy='simple',
retries=3,
app_cache=True,
checkpoint_mode='task_exit',
usage_tracking=LEVEL_1,
)
Illinois Campus Cluster (UIUC)
The following snippet shows an example configuration for executing on the Illinois Campus Cluster.
The configuration assumes the user is running on a login node and uses the parsl.providers.SlurmProvider to interface with the scheduler and the parsl.launchers.SrunLauncher to launch workers.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from parsl.usage_tracking.levels import LEVEL_1
""" This config assumes that it is used to launch parsl tasks from the login nodes
of the Campus Cluster at UIUC. Each job submitted to the scheduler will request 2 nodes for 10 minutes.
"""
config = Config(
executors=[
HighThroughputExecutor(
label="CC_htex",
worker_debug=False,
cores_per_worker=16.0, # each worker uses a full node
provider=SlurmProvider(
partition='secondary-fdr', # partition
nodes_per_block=2, # number of nodes
init_blocks=1,
max_blocks=1,
scheduler_options='',
cmd_timeout=60,
walltime='00:10:00',
launcher=SrunLauncher(),
worker_init='conda activate envParsl', # requires conda environment with parsl
),
)
],
usage_tracking=LEVEL_1,
)
Bridges (PSC)
The following snippet shows an example configuration for executing on the Bridges supercomputer at the Pittsburgh Supercomputing Center.
The configuration assumes the user is running on a login node and uses the parsl.providers.SlurmProvider to interface with the scheduler and the parsl.launchers.SrunLauncher to launch workers.
from parsl.addresses import address_by_interface
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from parsl.usage_tracking.levels import LEVEL_1
""" This config assumes that it is used to launch parsl tasks from the login nodes
of Bridges at PSC. Each job submitted to the scheduler will request 2 nodes for 10 minutes.
"""
config = Config(
executors=[
HighThroughputExecutor(
label='Bridges_HTEX_multinode',
address=address_by_interface('ens3f0'),
max_workers_per_node=1,
provider=SlurmProvider(
'YOUR_PARTITION_NAME', # Specify Partition / QOS, for eg. RM-small
nodes_per_block=2,
init_blocks=1,
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler eg: '#SBATCH --gres=gpu:type:n'
scheduler_options='',
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init='',
# We request all hyperthreads on a node.
launcher=SrunLauncher(),
walltime='00:10:00',
# Slurm scheduler on Bridges can be slow at times,
# increase the command timeouts
cmd_timeout=120,
),
)
],
usage_tracking=LEVEL_1,
)
CC-IN2P3
The snippet below shows an example configuration for executing from a login node on IN2P3’s Computing Centre. The configuration runs the main Parsl program directly on a login node, primarily to avoid GSISSH, which Parsl does not support. This system uses Grid Engine, which Parsl interfaces with using the parsl.providers.GridEngineProvider.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import GridEngineProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label='cc_in2p3_htex',
max_workers_per_node=2,
provider=GridEngineProvider(
nodes_per_block=1,
init_blocks=2,
max_blocks=2,
walltime="00:20:00",
scheduler_options='', # Input your scheduler_options if needed
worker_init='', # Input your worker_init if needed
),
)
],
usage_tracking=LEVEL_1,
)
CCL (Notre Dame, TaskVine)
To utilize TaskVine with Parsl, please install the full CCTools software package within an appropriate Anaconda or Miniconda environment (instructions for installing Miniconda can be found in the Conda install guide):
$ conda create -y --name <environment> python=<version> conda-pack
$ conda activate <environment>
$ conda install -y -c conda-forge ndcctools parsl
This creates a Conda environment on your machine with all the necessary tools and setup needed to utilize TaskVine with the Parsl library.
The following snippet shows an example configuration for using the Parsl/TaskVine executor to run applications on the local machine. This example uses the parsl.executors.taskvine.TaskVineExecutor to schedule tasks, and a local worker will be started automatically. For more information on using TaskVine, including configurations for remote execution, visit the TaskVine/Parsl documentation online.
import uuid
from parsl.config import Config
from parsl.executors.taskvine import TaskVineExecutor, TaskVineManagerConfig
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
TaskVineExecutor(
label="parsl-vine-example",
# If a project_name is given, then TaskVine will periodically
# report its status and performance back to the global TaskVine catalog,
# which can be viewed here: http://ccl.cse.nd.edu/software/taskvine/status
# To disable status reporting, comment out the project_name.
manager_config=TaskVineManagerConfig(project_name="parsl-vine-" + str(uuid.uuid4())),
)
],
usage_tracking=LEVEL_1,
)
TaskVine’s predecessor, WorkQueue, may continue to be used with Parsl. For more information on using WorkQueue visit the CCTools documentation online.
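As a minimal, hedged sketch (see the CCTools documentation for authoritative examples), a WorkQueue-based configuration mirrors the TaskVine one above but uses the parsl.executors.WorkQueueExecutor; the label and port are arbitrary placeholders.
from parsl.config import Config
from parsl.executors import WorkQueueExecutor
from parsl.usage_tracking.levels import LEVEL_1

# Sketch only: workers are started locally by the default provider, and
# additional work_queue_worker processes may connect to the chosen port.
config = Config(
    executors=[
        WorkQueueExecutor(
            label="parsl-wq-example",
            port=9123,  # placeholder port; pick one open on your system
        )
    ],
    usage_tracking=LEVEL_1,
)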
Expanse (SDSC)
The following snippet shows an example configuration for executing remotely on the San Diego Supercomputer Center’s Expanse supercomputer. The example is designed to be executed on the login nodes, using the parsl.providers.SlurmProvider to interface with Expanse’s Slurm scheduler and the parsl.launchers.SrunLauncher to launch workers.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label='Expanse_CPU_Multinode',
max_workers_per_node=32,
provider=SlurmProvider(
'compute',
account='YOUR_ALLOCATION_ON_EXPANSE',
launcher=SrunLauncher(),
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler
scheduler_options='',
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init='',
walltime='01:00:00',
init_blocks=1,
max_blocks=1,
nodes_per_block=2,
),
)
],
usage_tracking=LEVEL_1,
)
Improv (Argonne LCRC)
Improv is a PBS Pro based supercomputer at Argonne’s Laboratory Computing Resource
Center (LCRC). The following snippet is an example configuration that uses parsl.providers.PBSProProvider
and parsl.launchers.MpiRunLauncher
to run on multinode jobs.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import MpiRunLauncher
from parsl.providers import PBSProProvider
config = Config(
executors=[
HighThroughputExecutor(
label="Improv_multinode",
max_workers_per_node=32,
provider=PBSProProvider(
account="YOUR_ALLOCATION_ON_IMPROV",
# PBS directives (header lines), for example:
# scheduler_options='#PBS -l mem=4gb',
scheduler_options='',
queue="compute",
# Command to be run before starting a worker:
# **WARNING** Improv requires an openmpi module to be
# loaded for the MpiRunLauncher. Add additional env
# load commands to this multiline string.
worker_init='''
module load gcc/13.2.0;
module load openmpi/5.0.3-gcc-13.2.0; ''',
launcher=MpiRunLauncher(),
# number of compute nodes allocated for each block
nodes_per_block=2,
walltime='00:10:00'
),
),
],
)
Perlmutter (NERSC)
NERSC provides documentation on how to use Parsl on Perlmutter. Perlmutter is a Slurm-based HPC system, and Parsl uses the parsl.providers.SlurmProvider with the parsl.launchers.SrunLauncher to launch tasks onto this machine.
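A hedged sketch of such a configuration is shown below; treat NERSC’s own documentation as authoritative. The account, Slurm directives, and sizing values are placeholders.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider

# Sketch only: account, QOS/constraint directives, and walltime are placeholders.
config = Config(
    executors=[
        HighThroughputExecutor(
            label='perlmutter_htex',
            provider=SlurmProvider(
                account='YOUR_NERSC_ALLOCATION',
                # Slurm directives, e.g. QOS ('-q') and constraint ('-C')
                scheduler_options='#SBATCH -q regular\n#SBATCH -C cpu',
                # Command run before starting workers, e.g. module loads
                # or activating a Python environment.
                worker_init='',
                nodes_per_block=2,
                init_blocks=1,
                max_blocks=1,
                walltime='00:30:00',
                launcher=SrunLauncher(),
            ),
        )
    ],
)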
Frontera (TACC)
Deployed in June 2019, Frontera was at launch the 5th most powerful supercomputer in the world. Frontera replaced the NSF Blue Waters system at NCSA and is the first deployment in the National Science Foundation’s petascale computing program. The configuration below assumes that the user is running on a login node and uses the parsl.providers.SlurmProvider to interface with the scheduler and the parsl.launchers.SrunLauncher to launch workers.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from parsl.usage_tracking.levels import LEVEL_1
""" This config assumes that it is used to launch parsl tasks from the login nodes
of Frontera at TACC. Each job submitted to the scheduler will request 2 nodes for 10 minutes.
"""
config = Config(
executors=[
HighThroughputExecutor(
label="frontera_htex",
max_workers_per_node=1, # Set number of workers per node
provider=SlurmProvider(
cmd_timeout=60, # Add extra time for slow scheduler responses
nodes_per_block=2,
init_blocks=1,
min_blocks=1,
max_blocks=1,
partition='normal', # Replace with partition name
scheduler_options='#SBATCH -A <YOUR_ALLOCATION>', # Enter scheduler_options if needed
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init='',
# Ideally we set the walltime to the longest supported walltime.
walltime='00:10:00',
launcher=SrunLauncher(),
),
)
],
usage_tracking=LEVEL_1,
)
Kubernetes Clusters
Kubernetes is an open-source system for container management, such as automating deployment and scaling of containers. The snippet below shows an example configuration for deploying pods as workers on a Kubernetes cluster. The parsl.providers.KubernetesProvider uses the Python Kubernetes API, which assumes that you have a kube config file in ~/.kube/config.
from parsl.addresses import address_by_route
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import KubernetesProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label='kube-htex',
cores_per_worker=1,
max_workers_per_node=1,
worker_logdir_root='YOUR_WORK_DIR',
# Address for the pod worker to connect back
address=address_by_route(),
provider=KubernetesProvider(
namespace="default",
# Docker image url to use for pods
image='YOUR_DOCKER_URL',
# Command to be run upon pod start, such as:
# 'module load Anaconda; source activate parsl_env'.
# or 'pip install parsl'
worker_init='',
# The secret key to download the image
secret="YOUR_KUBE_SECRET",
# Should follow the Kubernetes naming rules
pod_name='YOUR-POD-Name',
nodes_per_block=1,
init_blocks=1,
# Maximum number of pods to scale up
max_blocks=10,
),
),
],
usage_tracking=LEVEL_1,
)
Midway (RCC, UChicago)
The Midway cluster is a campus cluster hosted by the Research Computing Center at the University of Chicago. The snippet below shows an example configuration for executing remotely on Midway. The configuration assumes the user is running on a login node and uses the parsl.providers.SlurmProvider to interface with the scheduler and the parsl.launchers.SrunLauncher to launch workers.
from parsl.addresses import address_by_interface
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label='Midway_HTEX_multinode',
address=address_by_interface('bond0'),
worker_debug=False,
max_workers_per_node=2,
provider=SlurmProvider(
'YOUR_PARTITION', # Partition name, e.g 'broadwl'
launcher=SrunLauncher(),
nodes_per_block=2,
init_blocks=1,
min_blocks=1,
max_blocks=1,
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler eg: '#SBATCH --constraint=knl,quad,cache'
scheduler_options='',
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init='',
walltime='00:30:00'
),
)
],
usage_tracking=LEVEL_1,
)
Open Science Grid
The Open Science Grid (OSG) is a national, distributed computing grid spanning over 100 individual sites that provides tens of thousands of CPU cores. The snippet below shows an example configuration for executing remotely on OSG. You will need a valid project name on the OSG. The configuration uses the parsl.providers.CondorProvider to interface with the scheduler.
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import CondorProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label='OSG_HTEX',
max_workers_per_node=1,
provider=CondorProvider(
nodes_per_block=1,
init_blocks=4,
max_blocks=4,
# This scheduler option string ensures that the compute nodes provisioned
# will have modules
scheduler_options="""
+ProjectName = "MyProject"
Requirements = HAS_MODULES=?=TRUE
""",
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init='''unset HOME; unset PYTHONPATH; module load python/3.7.0;
python3 -m venv parsl_env; source parsl_env/bin/activate; python3 -m pip install parsl''',
walltime="00:20:00",
),
worker_logdir_root='$OSG_WN_TMP',
worker_ports=(31000, 31001)
)
],
usage_tracking=LEVEL_1,
)
Polaris (ALCF)
ALCF provides documentation on how to use Parsl on Polaris. Polaris uses the parsl.providers.PBSProProvider and the parsl.launchers.MpiExecLauncher to launch tasks onto the HPC system.
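A hedged sketch, following the same pattern as the Improv example above, is shown below; the account, queue, and worker_init values are placeholders, and ALCF’s documentation should be treated as authoritative (it also covers recommended CPU binding options for the launcher).
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import MpiExecLauncher
from parsl.providers import PBSProProvider

# Sketch only: account, queue, and worker_init are placeholders.
config = Config(
    executors=[
        HighThroughputExecutor(
            label='polaris_htex',
            provider=PBSProProvider(
                account='YOUR_ALLOCATION_ON_POLARIS',
                queue='YOUR_QUEUE',          # e.g. a debug or production queue
                scheduler_options='',
                # Command run before starting workers, e.g. module loads
                # or activating a Python environment.
                worker_init='',
                nodes_per_block=2,
                walltime='00:30:00',
                launcher=MpiExecLauncher(),
            ),
        )
    ],
)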
Stampede2 (TACC)
The following snippet shows an example configuration for accessing TACC’s Stampede2 supercomputer. This example uses the parsl.executors.HighThroughputExecutor and connects to Stampede2’s Slurm scheduler.
from parsl.addresses import address_by_interface
from parsl.config import Config
from parsl.data_provider.globus import GlobusStaging
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label='Stampede2_HTEX',
address=address_by_interface('em3'),
max_workers_per_node=2,
provider=SlurmProvider(
nodes_per_block=2,
init_blocks=1,
min_blocks=1,
max_blocks=1,
partition='YOUR_PARTITION',
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler eg: '#SBATCH --constraint=knl,quad,cache'
scheduler_options='',
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init='',
launcher=SrunLauncher(),
walltime='00:30:00'
),
storage_access=[GlobusStaging(
endpoint_uuid='ceea5ca0-89a9-11e7-a97f-22000a92523b',
endpoint_path='/',
local_path='/'
)]
)
],
usage_tracking=LEVEL_1,
)
Summit (ORNL)
The following snippet shows an example configuration for executing from the login node on Summit, the leadership class supercomputer hosted at the Oak Ridge National Laboratory.
The example uses the parsl.providers.LSFProvider
to provision compute nodes from the LSF cluster scheduler and the parsl.launchers.JsrunLauncher
to launch workers across the compute nodes.
from parsl.addresses import address_by_interface
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import JsrunLauncher
from parsl.providers import LSFProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
HighThroughputExecutor(
label='Summit_HTEX',
# On Summit ensure that the working dir is writeable from the compute nodes,
# for eg. paths below /gpfs/alpine/world-shared/
working_dir='YOUR_WORKING_DIR_ON_SHARED_FS',
address=address_by_interface('ib0'), # This assumes Parsl is running on login node
worker_port_range=(50000, 55000),
provider=LSFProvider(
launcher=JsrunLauncher(),
walltime="00:10:00",
nodes_per_block=2,
init_blocks=1,
max_blocks=1,
worker_init='', # Input your worker environment initialization commands
project='YOUR_PROJECT_ALLOCATION',
cmd_timeout=60
),
)
],
usage_tracking=LEVEL_1,
)
TOSS3 (LLNL)
The following snippet shows an example configuration for executing on one of LLNL’s TOSS3 machines, such as Quartz, Ruby, Topaz, Jade, or Magma. This example uses the parsl.executors.FluxExecutor and connects to Slurm using the parsl.providers.SlurmProvider. This configuration assumes that the script is being executed on the login nodes of one of the machines.
from parsl.config import Config
from parsl.executors import FluxExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from parsl.usage_tracking.levels import LEVEL_1
config = Config(
executors=[
FluxExecutor(
provider=SlurmProvider(
partition="YOUR_PARTITION", # e.g. "pbatch", "pdebug"
account="YOUR_ACCOUNT",
launcher=SrunLauncher(overrides="--mpibind=off"),
nodes_per_block=1,
init_blocks=1,
min_blocks=1,
max_blocks=1,
walltime="00:30:00",
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler, e.g.: '#SBATCH -t 50'
scheduler_options='',
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init='',
cmd_timeout=120,
),
)
],
usage_tracking=LEVEL_1,
)