parsl.providers.TorqueProvider
- class parsl.providers.TorqueProvider(account=None, queue=None, scheduler_options='', worker_init='', nodes_per_block=1, init_blocks=1, min_blocks=0, max_blocks=1, parallelism=1, launcher=AprunLauncher(debug=True, overrides=''), walltime='00:20:00', cmd_timeout=120)[source]
Torque Execution Provider
This provider uses qsub to submit, qstat for status, and qdel to cancel jobs. The qsub script to be used is created from a template file in this same module.
- Parameters:
account (str) – Account the job will be charged against.
queue (str) – Torque queue to request blocks from.
nodes_per_block (int) – Nodes to provision per block.
init_blocks (int) – Number of blocks to provision at the start of the run. Default is 1.
min_blocks (int) – Minimum number of blocks to maintain. Default is 0.
max_blocks (int) – Maximum number of blocks to maintain.
parallelism (float) – Ratio of provisioned task slots to active tasks. A parallelism value of 1 represents aggressive scaling where as many resources as possible are used; parallelism close to 0 represents the opposite situation in which as few resources as possible (i.e., min_blocks) are used.
walltime (str) – Walltime requested per block in HH:MM:SS.
scheduler_options (str) – String to prepend to the #PBS blocks in the submit script sent to the scheduler. WARNING: scheduler_options should contain only #PBS strings and should not end with a trailing newline.
worker_init (str) – Command to be run before starting a worker, such as ‘module load Anaconda; source activate env’.
launcher (Launcher) – Launcher for this provider. Possible launchers include AprunLauncher (the default) or SingleNodeLauncher.
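The warning on scheduler_options (only #PBS strings, no trailing newline) can be enforced before the value is handed to the provider. The following is a minimal sketch, not part of Parsl: build_scheduler_options is a hypothetical helper name, and the two directives are only example inputs.

```python
def build_scheduler_options(directives):
    """Join #PBS directive lines into one scheduler_options string.

    Hypothetical helper (not part of Parsl). Rejects any line that is not
    a #PBS directive and never leaves a trailing newline.
    """
    lines = []
    for directive in directives:
        line = directive.strip()  # drops stray whitespace and newlines
        if not line.startswith("#PBS"):
            raise ValueError(f"not a #PBS directive: {directive!r}")
        lines.append(line)
    # Joining with "\n" separates the lines without adding one at the end.
    return "\n".join(lines)


opts = build_scheduler_options(["#PBS -j oe", "#PBS -m abe\n"])
print(repr(opts))
```

The resulting string could then be passed as TorqueProvider(scheduler_options=opts, ...), alongside the other parameters listed above.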
- __init__(account=None, queue=None, scheduler_options='', worker_init='', nodes_per_block=1, init_blocks=1, min_blocks=0, max_blocks=1, parallelism=1, launcher=AprunLauncher(debug=True, overrides=''), walltime='00:20:00', cmd_timeout=120)[source]
Methods
__init__([account, queue, ...])
cancel(job_ids) – Cancels the jobs specified by a list of job IDs.
execute_wait(cmd[, timeout])
status(job_ids) – Gets the status of a list of jobs identified by the job identifiers returned from the submit request.
submit(command, tasks_per_node[, job_name]) – Submits the command as a Local Resource Manager job.
Attributes
cores_per_node – Number of cores to provision per node.
label – Provides the label for this provider.
mem_per_node – Real memory to provision per node in GB.
status_polling_interval – Returns the interval, in seconds, at which the status method should be called.
- cancel(job_ids)[source]
Cancels the jobs specified by a list of job IDs.
- Parameters:
job_ids (list) – List of job identifiers, [<job_id> …].
- Returns:
A list of booleans, [True/False …]. If the cancel operation fails, the entire list will be False.
- property status_polling_interval[source]
Returns the interval, in seconds, at which the status method should be called.
- Returns:
the number of seconds to wait between calls to status()
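A polling loop built on this property might look like the sketch below. It is an illustration under stated assumptions, not Parsl's own monitoring code: wait_for_jobs and max_polls are hypothetical names, and each object returned by status() is assumed to expose a boolean terminal attribute.

```python
import time


def wait_for_jobs(provider, job_ids, max_polls=100, sleep=time.sleep):
    """Poll provider.status(job_ids) until every job is terminal.

    Hypothetical helper (not part of Parsl). Assumes each status object
    has a boolean `terminal` attribute.
    """
    for _ in range(max_polls):
        statuses = provider.status(job_ids)
        if all(s.terminal for s in statuses):
            return statuses
        # Wait the provider's suggested interval before asking again.
        sleep(provider.status_polling_interval)
    raise TimeoutError(f"jobs still running after {max_polls} polls")
```

Passing sleep as a parameter keeps the loop easy to exercise without a real scheduler, since a test can substitute a function that records the requested delays instead of waiting.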
- submit(command, tasks_per_node, job_name='parsl.torque')[source]
Submits the command as a Local Resource Manager job. Submit returns an ID that corresponds to the task that was just submitted.
- If tasks_per_node < 1:
This is illegal. tasks_per_node should be a positive integer.
- If tasks_per_node == 1:
A single node is provisioned.
- If tasks_per_node > 1:
tasks_per_node number of nodes are provisioned.
- Parameters:
command (str) – Command-line invocation to be made on the remote side.
tasks_per_node (int) – Number of command invocations to be launched per node.
- Kwargs:
job_name (str) – Name for the job; must be unique.
- Returns:
None if the provider is at capacity and cannot provision more; otherwise job_id (str), the identifier for the job.
- Return type:
None or str
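Because submit can come back without a job ID when the provider is at capacity, callers may want to check the result explicitly. The wrapper below is a hedged sketch: submit_block is a hypothetical name, and raising exceptions is one possible policy rather than anything Parsl prescribes.

```python
def submit_block(provider, command, tasks_per_node, job_name="parsl.torque"):
    """Submit a command and fail loudly when no job ID comes back.

    Hypothetical wrapper (not part of Parsl) around provider.submit().
    """
    # The documentation above treats tasks_per_node < 1 as illegal.
    if not isinstance(tasks_per_node, int) or tasks_per_node < 1:
        raise ValueError("tasks_per_node must be an integer >= 1")
    job_id = provider.submit(command, tasks_per_node, job_name=job_name)
    if job_id is None:
        # None signals that the provider could not provision more.
        raise RuntimeError("provider is at capacity; no job was submitted")
    return job_id
```

A caller that prefers to retry later could catch the RuntimeError, or the wrapper could return None unchanged and leave the decision to the caller.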