parsl.providers.SlurmProvider

class parsl.providers.SlurmProvider(partition: str | None = None, account: str | None = None, qos: str | None = None, constraint: str | None = None, clusters: str | None = None, nodes_per_block: int = 1, cores_per_node: int | None = None, mem_per_node: int | None = None, init_blocks: int = 1, min_blocks: int = 0, max_blocks: int = 1, parallelism: float = 1, walltime: str = '00:10:00', scheduler_options: str = '', regex_job_id: str = 'Submitted batch job (?P<id>\\S*)', worker_init: str = '', cmd_timeout: int = 10, exclusive: bool = True, launcher: Launcher = SingleNodeLauncher(debug=True, fail_on_any=False))[source]

Slurm Execution Provider

This provider uses sbatch to submit, sacct for status and scancel to cancel jobs. The sbatch script to be used is created from a template file in this same module.

Parameters:
  • partition (str) – Slurm partition to request blocks from. If unspecified or None, no partition slurm directive will be specified.

  • account (str) – Slurm account to which to charge resources used by the job. If unspecified or None, the job will use the user’s default account.

  • qos (str) – Slurm quality of service (QOS) to request for the job. If unspecified or None, no QOS slurm directive will be specified.

  • constraint (str) – Slurm job constraint, often used to choose cpu or gpu type. If unspecified or None, no constraint slurm directive will be added.

  • clusters (str) – Slurm cluster name, or comma-separated cluster list, used to choose between different clusters in a federated Slurm instance. If unspecified or None, no slurm directive for clusters will be added.

  • nodes_per_block (int) – Nodes to provision per block.

  • cores_per_node (int) – Specify the number of cores to provision per node. If set to None, executors will assume all cores on the node are available for computation. Default is None.

  • mem_per_node (int) – Specify the real memory to provision per node in GB. If set to None, no explicit request to the scheduler will be made. Default is None.

  • init_blocks (int) – Number of blocks to provision at the start of the run. Default is 1.

  • min_blocks (int) – Minimum number of blocks to maintain.

  • max_blocks (int) – Maximum number of blocks to maintain.

  • parallelism (float) – Ratio of provisioned task slots to active tasks. A parallelism value of 1 represents aggressive scaling where as many resources as possible are used; parallelism close to 0 represents the opposite situation in which as few resources as possible (i.e., min_blocks) are used.

  • walltime (str) – Walltime requested per block in HH:MM:SS.

  • scheduler_options (str) – String to prepend to the #SBATCH directives in the submit script sent to the scheduler.

  • regex_job_id (str) – The regular expression used to extract the job ID from the sbatch standard output. The default is r"Submitted batch job (?P<id>\S*)", where id is the regular expression symbolic group for the job ID.

  • worker_init (str) – Command to be run before starting a worker, such as ‘module load Anaconda; source activate env’.

  • exclusive (bool) – Requests nodes which are not shared with other running jobs. Default is True.

  • launcher (Launcher) – Launcher for this provider. Possible launchers include SingleNodeLauncher (the default), SrunLauncher, or AprunLauncher.
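The regex_job_id parameter can be illustrated with a short, standalone sketch. The helper extract_job_id below is hypothetical and not part of Parsl. It assumes the pattern is tried against each line of the sbatch output with re.match, which may differ from how the provider applies it internally, and the sample output strings are made up for illustration.

```python
import re

# Default pattern from the SlurmProvider signature; "id" is the named group
# that captures the job ID.
REGEX_JOB_ID = r"Submitted batch job (?P<id>\S*)"


def extract_job_id(sbatch_stdout, pattern=REGEX_JOB_ID):
    """Return the job ID found in sbatch output, or None if no line matches.

    Hypothetical helper for illustration only.
    """
    for line in sbatch_stdout.splitlines():
        match = re.match(pattern, line.strip())
        if match:
            return match.group("id")
    return None


# Typical single-cluster output.
job_id = extract_job_id("Submitted batch job 1234567\n")

# Output with trailing text after the ID; the \S* group stops at whitespace.
federated_id = extract_job_id("Submitted batch job 42 on cluster alpha\n")

# Output with no matching line.
missing = extract_job_id("sbatch: error: invalid partition specified\n")
```

A site whose sbatch wrapper prints a different message would need a different pattern, as long as it still defines a named group called id.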

__init__(partition: str | None = None, account: str | None = None, qos: str | None = None, constraint: str | None = None, clusters: str | None = None, nodes_per_block: int = 1, cores_per_node: int | None = None, mem_per_node: int | None = None, init_blocks: int = 1, min_blocks: int = 0, max_blocks: int = 1, parallelism: float = 1, walltime: str = '00:10:00', scheduler_options: str = '', regex_job_id: str = 'Submitted batch job (?P<id>\\S*)', worker_init: str = '', cmd_timeout: int = 10, exclusive: bool = True, launcher: Launcher = SingleNodeLauncher(debug=True, fail_on_any=False))[source]
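A minimal sketch of how these constructor arguments might be combined in a Parsl configuration follows. The partition name, account name, executor label, and scheduler option are placeholders rather than values taken from this page, and pairing the provider with HighThroughputExecutor and SrunLauncher is an assumption about a typical multi-node setup, not a requirement.

```python
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider

# All string values below are placeholders; substitute the ones for your site.
provider = SlurmProvider(
    partition="debug",              # placeholder partition name
    account="my_allocation",        # placeholder account to charge
    nodes_per_block=2,              # each block is one 2-node Slurm job
    init_blocks=1,                  # blocks requested at the start of the run
    min_blocks=0,
    max_blocks=4,
    walltime="00:30:00",            # HH:MM:SS, requested per block
    scheduler_options="#SBATCH --mail-type=END",  # extra directive (example only)
    worker_init="module load Anaconda; source activate env",
    exclusive=True,
    launcher=SrunLauncher(),        # assumed choice for multi-node blocks
)

config = Config(
    executors=[
        HighThroughputExecutor(
            label="slurm_htex",     # placeholder label
            provider=provider,
        )
    ]
)
```

Passing config to parsl.load() would then start the executor, which requests blocks through the provider.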

Methods

__init__([partition, account, qos, ...])

cancel(job_ids)

Cancels the jobs specified by a list of job ids.

execute_wait(cmd[, timeout])

status(job_ids)

Get the status of a list of jobs identified by the job identifiers returned from the submit request.

submit(command, tasks_per_node[, job_name])

Submit the command as a slurm job.

Attributes

cores_per_node

Number of cores to provision per node.

label

Provides the label for this provider.

mem_per_node

Real memory to provision per node in GB.

status_polling_interval

Returns the interval, in seconds, at which the status method should be called.

cancel(job_ids)[source]

Cancels the jobs specified by a list of job ids.

Parameters:
  • job_ids (list) – List of job identifiers to cancel.

Returns:

A list of booleans, one per job id. If the cancel operation fails, the entire list will be False.

property status_polling_interval[source]

Returns the interval, in seconds, at which the status method should be called.

Returns:

the number of seconds to wait between calls to status()

submit(command: str, tasks_per_node: int, job_name='parsl.slurm') → str[source]

Submit the command as a slurm job.

Parameters:
  • command (str) – Command to be executed on the remote side.

  • tasks_per_node (int) – Number of command invocations to be launched per node.

  • job_name (str) – Name for the job.

Returns:

job id – A string identifier for the job

Return type:

str