Modularizing Parsl workflows
Parsl workflows can be developed in many ways. For a simple workflow, it is often convenient to keep the app definitions and control logic in a single script. As a workflow grows and changes, however, it benefits from modularization like any other code. The benefits include:
- Better readability
- Logical separation of components (e.g., apps, config, and control logic)
- Ease of reuse of components
Note
Support for isolating configuration loading and app definition has been available since Parsl 0.6.0. See Issue #50.
The following example illustrates how a Parsl project can be organized into modules.
The configuration(s) can be defined in a module or file (e.g., config.py), which can be imported into the control script depending on which execution resources should be used.
import parsl
from parsl.config import Config
from parsl.executors.threads import ThreadPoolExecutor

local_threads = Config(
    executors=[ThreadPoolExecutor(max_threads=4)],
    lazy_errors=True
)
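The idea of choosing a configuration based on the target execution resources can be sketched with a small name-based lookup. This is a minimal illustration, not part of Parsl: the CONFIGS registry, the select_config helper, the "cluster" entry, and the WORKFLOW_CONFIG environment variable are all hypothetical names, and the string values stand in for parsl.config.Config objects such as local_threads so that the sketch runs without Parsl installed.

```python
import os

# Hypothetical registry that could live in config.py. The string values are
# placeholders; in a real project each one would be a parsl.config.Config
# object such as local_threads.
CONFIGS = {
    "local_threads": "<Config using ThreadPoolExecutor(max_threads=4)>",
    "cluster": "<Config for a hypothetical cluster>",
}


def select_config(name=None, registry=CONFIGS, default="local_threads"):
    """Return the registered configuration for `name`.

    When `name` is None, fall back to the WORKFLOW_CONFIG environment
    variable (a name invented for this sketch), then to `default`.
    """
    if name is None:
        name = os.environ.get("WORKFLOW_CONFIG", default)
    try:
        return registry[name]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(
            f"unknown config {name!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    # Explicit selection by name, independent of the environment.
    print(select_config("local_threads"))
```

In a control script, the object returned by such a helper would be passed to parsl.load in place of a hard-coded import, so switching resources would not require editing the control logic.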
Parsl apps can be defined in separate file(s) or module(s) (e.g., library.py) grouped by functionality.
from parsl import python_app

@python_app
def increment(x):
    return x + 1
Finally, the control logic for the Parsl application can be implemented in a separate file (e.g., run_increment.py). This file must import and load the configuration from config.py before calling the increment app from library.py:
import parsl
from config import local_threads
from library import increment

parsl.load(local_threads)

for i in range(5):
    print('{} + 1 = {}'.format(i, increment(i).result()))
Running this script produces the following output:
0 + 1 = 1
1 + 1 = 2
2 + 1 = 3
3 + 1 = 4
4 + 1 = 5