parsl.dataflow.dflow.DataFlowKernel

class parsl.dataflow.dflow.DataFlowKernel(config: Config)[source]

The DataFlowKernel adds dependency awareness to an existing executor.

It is responsible for managing futures, such that when dependencies are resolved, pending tasks move to the runnable state.

Here is a simplified diagram of what happens internally:

 User             |        DFK         |    Executor
----------------------------------------------------------
                  |                    |
       Task-------+> +Submit           |
     App_Fu<------+--|                 |
                  |  Dependencies met  |
                  |         task-------+--> +Submit
                  |        Ex_Fu<------+----|

__init__(config: Config) → None[source]

Initialize the DataFlowKernel.

Parameters:: config (Config) – A specification of all configuration options. For more details see the :class:~`parsl.config.Config` documentation.

Methods

`__init__`(config)	Initialize the DataFlowKernel.
`add_executors`(executors)
`atexit_cleanup`()
`check_staging_inhibited`(kwargs)
`cleanup`()	Clean-up by closing all of the components used by the DFK
`default_std_autopath`(taskrecord, kw)
`handle_exec_update`(task_record, future)	This function is called only as a callback from an execution attempt reaching a final state (either successfully or failing).
`handle_join_update`(task_record, inner_app_future)
`launch_if_ready`(task_record)	Schedules a task record for re-inspection to see if it is ready for launch and for launch if it is ready.
`launch_task`(task_record)	Handle the actual submission of the task to the executor layer.
`log_task_states`()
`render_future_description`(dep)	Renders a description of the future in the context of the current DFK.
`submit`(func, app_args, executors, cache, ...)	Add task to the dataflow system.
`wait_for_current_tasks`()	Waits for all tasks in the task list to be completed, by waiting for their AppFuture to be completed.
`wipe_task`(task_id)	Remove task with task_id from the internal tasks table

Attributes

config

Returns the fully initialized config that the DFK is actively using.

add_executors(executors: Sequence[ParslExecutor]) → None[source]

atexit_cleanup() → None[source]

static check_staging_inhibited(kwargs: Dict[str, Any]) → bool[source]

cleanup() → None[source]: Clean-up by closing all of the components used by the DFK

property config: Config[source]

Returns the fully initialized config that the DFK is actively using.

Returns:

Config object

default_std_autopath(taskrecord, kw)[source]

handle_exec_update(task_record: TaskRecord, future: Future) → None[source]

This function is called only as a callback from an execution attempt reaching a final state (either successfully or failing).

It will launch retries if necessary, and update the task structure.

Parameters:

task_record (dict) – Task record
future (Future) – The future object corresponding to the task which
callback (makes this)

handle_join_update(task_record: TaskRecord, inner_app_future: AppFuture | None) → None[source]

launch_if_ready(task_record: TaskRecord) → None[source]

Schedules a task record for re-inspection to see if it is ready for launch and for launch if it is ready. The call will return immediately.

This should be called by any piece of the DataFlowKernel that thinks a task may have become ready to run.

It is not an error to call launch_if_ready on a task that is not ready to run - launch_if_ready will not incorrectly launch that task.

launch_if_ready is thread safe, so may be called from any thread or callback.

launch_task(task_record: TaskRecord) → Future[source]

Handle the actual submission of the task to the executor layer.

Parameters:: task_record – The task record
Returns:: Future that tracks the execution of the submitted function

log_task_states() → None[source]

render_future_description(dep: Future) → str[source]: Renders a description of the future in the context of the current DFK.

submit(func: Callable, app_args: Sequence[Any], executors: str | Sequence[str], cache: bool, ignore_for_cache: Sequence[str] | None, app_kwargs: Dict[str, Any], join: bool = False) → AppFuture[source]

Add task to the dataflow system.

If the app task has the executors attributes not set (default==’all’) the task will be launched on a randomly selected executor from the list of executors. If the app task specifies a particular set of executors, it will be targeted at the specified executors.

Parameters:: func (-) – A function object

KWargs :

app_args : Args to the function
executors (list or string)List of executors this call could go to.
Default=’all’
cache (Bool) : To enable memoization or not
ignore_for_cache (sequence) : List of kwargs to be ignored for memoization/checkpointing
app_kwargs (dict) : Rest of the kwargs to the fn passed as dict.

Returns:: AppFuture

wait_for_current_tasks() → None[source]: Waits for all tasks in the task list to be completed, by waiting for their AppFuture to be completed. This method will not necessarily wait for any tasks added after cleanup has started (such as data stageout?)

wipe_task(task_id: int) → None[source]: Remove task with task_id from the internal tasks table