parsl.providers.TorqueProvider

class parsl.providers.TorqueProvider(account=None, queue=None, scheduler_options='', worker_init='', nodes_per_block=1, init_blocks=1, min_blocks=0, max_blocks=1, parallelism=1, launcher=AprunLauncher(debug=True, overrides=''), walltime='00:20:00', cmd_timeout=120)

Torque Execution Provider

This provider uses qsub to submit, qstat for status, and qdel to cancel jobs. The qsub script to be used is created from a template file in this same module.

Parameters:
  • account (str) – Account the job will be charged against.

  • queue (str) – Torque queue to request blocks from.

  • nodes_per_block (int) – Nodes to provision per block.

  • init_blocks (int) – Number of blocks to provision at the start of the run. Default is 1.

  • min_blocks (int) – Minimum number of blocks to maintain. Default is 0.

  • max_blocks (int) – Maximum number of blocks to maintain.

  • parallelism (float) – Ratio of provisioned task slots to active tasks. A parallelism value of 1 represents aggressive scaling where as many resources as possible are used; parallelism close to 0 represents the opposite situation in which as few resources as possible (i.e., min_blocks) are used.

  • walltime (str) – Walltime requested per block in HH:MM:SS.

  • scheduler_options (str) – String prepended to the #PBS block of the submit script sent to the scheduler. WARNING: scheduler_options should contain only #PBS directives and must not end with a trailing newline.

  • worker_init (str) – Command to be run before starting a worker, such as ‘module load Anaconda; source activate env’.

  • launcher (Launcher) – Launcher for this provider. Possible launchers include AprunLauncher (the default) and SingleNodeLauncher.
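
As an illustration of the parameters above, the following sketch assembles constructor arguments for a two-node block. The account and queue names are placeholders, the `#PBS` directives are examples only, and the helpers `format_walltime`, `join_pbs_directives`, and `make_provider` are hypothetical names that are not part of Parsl. The provider is built lazily so that the helpers can be exercised without Parsl installed.

```python
from datetime import timedelta


def format_walltime(duration: timedelta) -> str:
    """Render a duration as the HH:MM:SS string expected by `walltime`."""
    total = int(duration.total_seconds())
    if total < 0:
        raise ValueError("walltime must be non-negative")
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def join_pbs_directives(directives):
    """Join #PBS lines with newlines, leaving no trailing newline."""
    cleaned = [d.strip() for d in directives]
    for line in cleaned:
        if not line.startswith("#PBS"):
            raise ValueError(f"not a #PBS directive: {line!r}")
    return "\n".join(cleaned)


# Placeholder values (assumptions); substitute your site's account and queue.
provider_kwargs = dict(
    account="my_account",
    queue="debug",
    nodes_per_block=2,
    init_blocks=1,
    min_blocks=0,
    max_blocks=1,
    walltime=format_walltime(timedelta(minutes=20)),
    scheduler_options=join_pbs_directives(["#PBS -j oe", "#PBS -m abe"]),
    worker_init="module load Anaconda; source activate env",
)


def make_provider():
    """Build the provider; this step requires Parsl to be installed."""
    from parsl.providers import TorqueProvider
    return TorqueProvider(**provider_kwargs)
```

Joining the directives through a helper is one way to respect the warning about trailing newlines in scheduler_options; passing a hand-written string works equally well.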

__init__(account=None, queue=None, scheduler_options='', worker_init='', nodes_per_block=1, init_blocks=1, min_blocks=0, max_blocks=1, parallelism=1, launcher=AprunLauncher(debug=True, overrides=''), walltime='00:20:00', cmd_timeout=120)

Methods

__init__([account, queue, ...])

cancel(job_ids)

Cancels the jobs specified by a list of job ids.

execute_wait(cmd[, timeout])

status(job_ids)

Get the status of a list of jobs identified by the job identifiers returned from the submit request.

submit(command, tasks_per_node[, job_name])

Submits the command onto a Local Resource Manager job.

Attributes

cores_per_node

Number of cores to provision per node.

label

Provides the label for this provider.

mem_per_node

Real memory to provision per node in GB.

status_polling_interval

Returns the interval, in seconds, at which the status method should be called.

cancel(job_ids)

Cancels the jobs specified by a list of job ids.

Parameters:
  • job_ids (list) – Job ids to cancel: [<job_id>, …]

Returns:

A list of booleans, [True/False, …]. If the cancel operation fails, the entire list will be False.
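
A short sketch of how a caller might pair each job id with its cancel result. `StubProvider` is a hypothetical stand-in, used only so the example runs without a scheduler; it assumes the returned list has one entry per job id, and `cancel_report` is likewise an illustrative helper, not part of Parsl.

```python
def cancel_report(provider, job_ids):
    """Return a {job_id: cancelled?} mapping built from provider.cancel()."""
    results = provider.cancel(job_ids)
    return dict(zip(job_ids, results))


class StubProvider:
    """Hypothetical stand-in for a provider; not part of Parsl."""

    def __init__(self, succeed):
        self.succeed = succeed

    def cancel(self, job_ids):
        # Mirrors the documented contract: all True, or all False on failure.
        return [self.succeed for _ in job_ids]


# Job ids below are made-up placeholders.
report = cancel_report(StubProvider(succeed=False), ["101.host", "102.host"])
failed = [job_id for job_id, ok in report.items() if not ok]
```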

property status_polling_interval

Returns the interval, in seconds, at which the status method should be called.

Returns:

The number of seconds to wait between calls to status().
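
One way a caller might honor this interval is a simple polling loop, sketched below. `poll_status` and `StubProvider` are hypothetical and not part of Parsl, and the status strings and the 60-second interval are placeholders, since the format of the values returned by status() is not described here. The `sleep` argument is injectable so the loop can be tested without waiting.

```python
import time


def poll_status(provider, job_ids, rounds, sleep=time.sleep):
    """Call provider.status() `rounds` times, pausing between calls."""
    snapshots = []
    for i in range(rounds):
        snapshots.append(provider.status(job_ids))
        if i < rounds - 1:
            # Wait the provider-recommended interval before asking again.
            sleep(provider.status_polling_interval)
    return snapshots


class StubProvider:
    """Hypothetical stand-in for a provider; not part of Parsl."""

    status_polling_interval = 60  # placeholder value

    def status(self, job_ids):
        # Placeholder status values; the real return format may differ.
        return ["PENDING" for _ in job_ids]
```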

submit(command, tasks_per_node, job_name='parsl.torque')

Submits the command onto a Local Resource Manager job. Submit returns an ID that corresponds to the task that was just submitted.

tasks_per_node should be an integer:

  • tasks_per_node < 1 – Illegal.

  • tasks_per_node == 1 – A single node is provisioned.

  • tasks_per_node > 1 – tasks_per_node number of nodes are provisioned.

Parameters:
  • command (str) – Command-line invocation to be made on the remote side.

  • tasks_per_node (int) – Number of command invocations to be launched per node.

Kwargs:
  • job_name (str) – Name for the job; must be unique.

Returns:

  • None – At capacity, cannot provision more.

  • job_id (str) – Identifier for the job.
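
Because submit can return None instead of a job id, callers may want to check the result explicitly. The sketch below shows one possible wrapper; `submit_block` and `StubProvider` are hypothetical names that are not part of Parsl, and the stub's capacity rule is an assumption made only so the example runs without a scheduler.

```python
def submit_block(provider, command, tasks_per_node, job_name="parsl.torque"):
    """Submit one block and return its job id, raising if none was provisioned."""
    if not isinstance(tasks_per_node, int) or tasks_per_node < 1:
        raise ValueError("tasks_per_node must be an integer >= 1")
    job_id = provider.submit(command, tasks_per_node, job_name=job_name)
    if job_id is None:
        # Documented return value when the provider is at capacity.
        raise RuntimeError("provider at capacity; cannot provision more")
    return job_id


class StubProvider:
    """Hypothetical stand-in for a provider; not part of Parsl."""

    def __init__(self, max_blocks=1):
        self.max_blocks = max_blocks
        self.jobs = []

    def submit(self, command, tasks_per_node, job_name="parsl.torque"):
        # Assumed capacity rule for the stub: refuse once max_blocks is reached.
        if len(self.jobs) >= self.max_blocks:
            return None
        job_id = f"{len(self.jobs) + 1}.stub"
        self.jobs.append(job_id)
        return job_id


provider = StubProvider(max_blocks=1)
first = submit_block(provider, "echo hello", tasks_per_node=1)
try:
    submit_block(provider, "echo hello", tasks_per_node=1)
    at_capacity = False
except RuntimeError:
    at_capacity = True
```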