parsl.dataflow.strategy.Strategy

class parsl.dataflow.strategy.Strategy(dfk)

FlowControl strategy.

As Parsl processes a workflow DAG, new tasks are added and completed asynchronously. Parsl interfaces executors with execution providers to construct scalable executors that handle the variable workload generated by the workflow. This component is responsible for periodically checking outstanding tasks and available compute capacity, and for triggering scaling events to match workflow needs.

Here’s a diagram of an executor. An executor consists of blocks, each of which is usually created by a single request to a Local Resource Manager (LRM) such as Slurm, Condor, Torque, or even the AWS API. A block can contain several task blocks, which are separate instances on workers.

           |<--min_blocks     |<-init_blocks              max_blocks-->|
           +----------------------------------------------------------+
           |  +--------block----------+       +--------block--------+ |
executor = |  | task          task    | ...   |    task      task   | |
           |  +-----------------------+       +---------------------+ |
           +----------------------------------------------------------+
The relevant specification options are:
  1. min_blocks: Minimum number of blocks to maintain

  2. init_blocks: Number of blocks to provision at initialization of the workflow

  3. max_blocks: Maximum number of blocks that can be active due to one workflow

active_tasks = pending_tasks + running_tasks

Parallelism = slots / tasks
            in [0, 1] (i.e., 0 <= p <= 1)

For example:

When p = 0,

=> Compute with the fewest resources possible. An unlimited number of tasks can be stacked per slot.

blocks =  min_blocks           { if active_tasks = 0
          max(min_blocks, 1)   {  else
When p = 1,

=> Compute with the most resources. One task is stacked per slot.

blocks = min ( max_blocks,
         ceil( active_tasks / slots ) )
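The two boundary cases above can be written out directly. This is a minimal sketch of the formulas as stated in this docstring, not Parsl's actual implementation. The function names are hypothetical, and `slots_per_block` is an assumed reading of "slots" in the p = 1 formula (the number of task slots a single block provides).

```python
import math


def blocks_at_p0(active_tasks: int, min_blocks: int) -> int:
    """p = 0: use the fewest blocks possible (hypothetical helper)."""
    if active_tasks == 0:
        return min_blocks
    # Any outstanding work needs at least one block to run on.
    return max(min_blocks, 1)


def blocks_at_p1(active_tasks: int, slots_per_block: int, max_blocks: int) -> int:
    """p = 1: one task per slot, capped at max_blocks (hypothetical helper)."""
    # Assumption: "slots" in the formula means slots per block.
    return min(max_blocks, math.ceil(active_tasks / slots_per_block))


pending_tasks, running_tasks = 3, 2
active_tasks = pending_tasks + running_tasks  # 5 active tasks

print(blocks_at_p0(active_tasks, min_blocks=0))                     # 1
print(blocks_at_p1(active_tasks, slots_per_block=2, max_blocks=4))  # 3
```

With 5 active tasks and 2 slots per block, p = 0 settles for a single block, while p = 1 asks for ceil(5 / 2) = 3 blocks, which is still under the cap of 4.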
When p = 1/2,

=> We stack up to 2 tasks per slot before we overflow and request a new block.

Let’s say min:init:max = 0:0:4 and task_blocks = 2. Consider the following example:

min_blocks = 0
init_blocks = 0
max_blocks = 4
tasks_per_node = 2
nodes_per_block = 1

In the diagram, X <- task

At 2 tasks:

+---Block---+
|           |
| X      X  |
|slot   slot|
+-----------+

At 5 tasks, we overflow because the capacity of a single block (2 slots x 2 tasks per slot = 4 tasks) is fully used.

+---Block---+       +---Block---+
| X      X  | ----> |           |
| X      X  |       | X         |
|slot   slot|       |slot   slot|
+-----------+       +-----------+
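The p = 1/2 walkthrough can be reproduced with a small calculation. The sketch below is one possible way to interpolate between the p = 0 and p = 1 cases and is not taken from Parsl's source. `target_blocks` is a hypothetical name, and clamping the result to `min_blocks` for every value of p is an assumption. It treats `parallelism * active_tasks` as the number of slots wanted and divides by the slots per block (`tasks_per_node * nodes_per_block`).

```python
import math


def target_blocks(active_tasks, parallelism, slots_per_block, min_blocks, max_blocks):
    """Estimate how many blocks to keep active (hypothetical helper)."""
    if active_tasks == 0:
        return min_blocks
    # Slots wanted so that about 1 / parallelism tasks stack on each slot.
    slots_wanted = active_tasks * parallelism
    blocks = math.ceil(slots_wanted / slots_per_block)
    # Keep at least one block while work is outstanding, within the limits.
    return min(max_blocks, max(min_blocks, blocks, 1))


# Values from the example: tasks_per_node = 2, nodes_per_block = 1.
slots_per_block = 2 * 1

for n_tasks in (2, 4, 5):
    n_blocks = target_blocks(n_tasks, 0.5, slots_per_block, min_blocks=0, max_blocks=4)
    print(n_tasks, "tasks ->", n_blocks, "block(s)")
# 2 tasks -> 1 block(s)
# 4 tasks -> 1 block(s)
# 5 tasks -> 2 block(s)
```

The fifth task is the first one that no longer fits in a single block at two tasks per slot, which matches the overflow shown in the diagram.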
__init__(dfk)

Initialize strategy.

Methods

__init__(dfk)

Initialize strategy.

add_executors(executors)

unset_logging()

Mute newly added handlers to the root level, right after calling executor.status