parsl.data_provider.data_manager.DataManager

class parsl.data_provider.data_manager.DataManager(dfk, max_threads=10)

The DataManager is responsible for transferring input and output data.

It uses the Executor interface: staging tasks are submitted to it, and it returns DataFutures.
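Because the DataManager presents this executor-style interface, staging can be requested against it directly once a DataFlowKernel is loaded. The following is a minimal sketch, not part of this API reference: the example config module, the file URL, and the executor label 'threads' are assumptions about the local setup:

    # A minimal sketch, assuming a default thread executor labelled
    # 'threads' and a file scheme the DataManager can stage. The URL
    # and the label are placeholders, not documented values.
    import parsl
    from parsl.configs.local_threads import config  # assumed example config
    from parsl.data_provider.files import File
    from parsl.data_provider.data_manager import DataManager

    parsl.load(config)

    dm = DataManager.get_data_manager()  # classmethod documented below

    f = File('ftp://example.com/inputs/data.txt')  # hypothetical source

    # stage_in submits a staging task through the executor-style
    # interface and returns a future for the staged file.
    staged = dm.stage_in(f, 'threads')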

__init__(dfk, max_threads=10)

Initialize the DataManager.

Parameters: dfk (DataFlowKernel) – The DataFlowKernel that this DataManager is managing data for.
Kwargs:
  • max_threads (int): Number of threads in the DataManager's thread pool. Default is 10.
  • executors (list of Executors): Executors for which data transfer will be managed.
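In normal use the DataFlowKernel constructs its own DataManager, so direct construction is rarely needed. A minimal sketch, assuming dfk is an already-initialised DataFlowKernel:

    # dfk is assumed to be an existing DataFlowKernel instance.
    from parsl.data_provider.data_manager import DataManager

    dm = DataManager(dfk, max_threads=4)  # smaller pool than the default 10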

Methods

__init__(dfk[, max_threads]) Initialize the DataManager.
add_file(file)
get_data_manager() Return the DataManager of the currently loaded DataFlowKernel.
scale_in(blocks, *args, **kwargs) Scale in method.
scale_out(*args, **kwargs) Scale out method.
shutdown([block]) Shutdown the ThreadPool.
stage_in(file, executor) Transport the file from the input source to the executor.
stage_out(file, executor) Transport the file from the local filesystem to the remote Globus endpoint.
start() Start the executor.
submit(*args, **kwargs) Submit a staging app.
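As a usage sketch for the staging methods above: the Globus endpoint UUID, remote path, and executor label below are placeholders, and the returned object is assumed to be a waitable future:

    # A minimal sketch of staging a result out via Globus, per the
    # stage_out summary above. All identifiers are placeholders.
    from parsl.data_provider.files import File
    from parsl.data_provider.data_manager import DataManager

    dm = DataManager.get_data_manager()

    # Hypothetical Globus URL: '<endpoint-uuid>' and the path stand in
    # for a real endpoint UUID and destination path.
    out = File('globus://<endpoint-uuid>/results/out.txt')

    fut = dm.stage_out(out, 'threads')  # executor label is an assumption
    fut.result()  # assumed waitable; blocks until the transfer finishes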

Attributes

run_dir Path to the run directory.
scaling_enabled Specify if scaling is enabled.