parsl.dataflow.memoization.Memoizer
-
class parsl.dataflow.memoization.Memoizer(dfk, memoize=True, checkpoint={})
Memoizer is responsible for ensuring that identical work is not repeated.
When a task is repeated, i.e., the same function is called with exactly the same arguments, the result from a previous execution is reused.
The memoizer implementation here does not collapse duplicate calls at call time, but works only when the result of a previous call is available at the time the duplicate call is made.
For instance:
No advantage from                 Memoization helps
memoization here:                 here:

 TaskA                            TaskB
   |   TaskA                        |
   |     |   TaskA                done  (TaskB)
   |     |     |                                (TaskB)
 done    |     |
       done    |
             done
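The two cases above can be imitated with a small stand-alone model. This is only a sketch: `ToyMemoizer`, `launch`, and `complete` are hypothetical names invented for illustration and are not part of Parsl. It assumes that a lookup happens when a task launches and that the table is updated only when a task completes.

```python
class ToyMemoizer:
    """Hypothetical model: results are stored only once a task completes."""

    def __init__(self):
        self.table = {}       # task key -> stored result
        self.executions = 0   # how many tasks actually had to run

    def launch(self, key):
        """Return (reused, value); a miss counts as a real execution."""
        if key in self.table:
            return True, self.table[key]
        self.executions += 1
        return False, None

    def complete(self, key, value):
        """Record the result of a finished task."""
        self.table[key] = value


# Left-hand case: three TaskA calls launch before any of them completes,
# so none of them can reuse a result.
left = ToyMemoizer()
left_reused = [left.launch("TaskA")[0] for _ in range(3)]

# Right-hand case: TaskB completes first, so later duplicates reuse its result.
right = ToyMemoizer()
right.launch("TaskB")
right.complete("TaskB", "result-B")
right_lookups = [right.launch("TaskB") for _ in range(2)]
```

In this model the left-hand case executes three times, while the right-hand case executes once and serves both duplicates from the table.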
The memoizer creates a lookup table by hashing the function name and its inputs, and storing the results of the function.
When a task is ready for launch, i.e., all of its arguments have resolved, we add its hash to the task data structure.
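A lookup table keyed on a hash of the function name and its inputs might be sketched as follows. The helper names `toy_make_hash` and `run_memoized` are hypothetical, and the use of `pickle` plus SHA-256 is an assumption made for this example, not a description of how Parsl computes its hashes.

```python
import hashlib
import pickle


def toy_make_hash(func_name, args, kwargs):
    """Hypothetical helper: hash a function name together with its inputs."""
    # Sort the keyword arguments so their order does not change the hash.
    payload = pickle.dumps((func_name, args, sorted(kwargs.items())))
    return hashlib.sha256(payload).hexdigest()


memo_table = {}  # hashsum -> stored result


def run_memoized(func, *args, **kwargs):
    """Run func unless a result for the same name and inputs is stored."""
    hashsum = toy_make_hash(func.__name__, args, kwargs)
    if hashsum in memo_table:
        return memo_table[hashsum]
    result = func(*args, **kwargs)
    memo_table[hashsum] = result
    return result


calls = []


def square(x):
    calls.append(x)  # record each real execution
    return x * x


first = run_memoized(square, 4)
second = run_memoized(square, 4)  # same name and inputs: served from the table
third = run_memoized(square, 5)   # different input: executed again
```

Only the inputs 4 and 5 are ever passed to `square` itself; the repeated call with 4 is answered from `memo_table`.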
-
__init__(dfk, memoize=True, checkpoint={})
Initialize the memoizer.
- Parameters
dfk – The DFK (DataFlowKernel) object
- KWargs:
memoize (Bool): enable memoization or not.
checkpoint (Dict): A checkpoint loaded as a dict.
Methods
__init__(dfk[, memoize, checkpoint]) – Initialize the memoizer.
check_memo(task) – Create a hash of the task and its inputs and check the lookup table for this hash.
hash_lookup(hashsum) – Look up a hash in the memoization table.
make_hash(task) – Create a hash of the task inputs.
update_memo(task, r) – Update the memoization lookup table with the result from a task.
-
check_memo(task)
Create a hash of the task and its inputs and check the lookup table for this hash.
If the hash is present, the stored result is returned. The result is a tuple indicating whether a memo exists, together with the result itself, since a result of None is possible and would otherwise be ambiguous. This avoids having to rely on a cache_miss exception.
- Parameters
task – Task from the dfk.tasks table
- Returns
A completed future containing the memoized result
- Return type
Result (Future)
This call will also set task['hashsum'] to the unique hashsum for the func+inputs.
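The reason for returning a presence flag alongside the result can be seen in a small sketch. It follows the tuple described in the text rather than the Future listed under Returns, and everything in it is hypothetical: `toy_check_memo`, the `func_name` and `args` keys, and the string used in place of a real hash are assumptions for illustration only.

```python
def toy_check_memo(table, task):
    """Hypothetical check: return (present, result) and tag the task."""
    # Stand-in for a real hash of the function and its inputs.
    hashsum = task["func_name"] + repr(task["args"])
    task["hashsum"] = hashsum  # side effect mirrored from the description
    if hashsum in table:
        return True, table[hashsum]
    return False, None


table = {}
task = {"func_name": "log_event", "args": (1, 2)}

# Nothing stored yet: a miss.
miss = toy_check_memo(table, task)

# Store a legitimate result of None, then look it up again.
table[task["hashsum"]] = None
hit = toy_check_memo(table, task)
```

Both lookups carry None as the result, and only the flag separates the miss from the hit on a task whose real result was None.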
-
hash_lookup(hashsum)
Look up a hash in the memoization table.
- Parameters
hashsum – The same hash used to uniquely identify an app and its inputs
- Returns
Lookup result
- Raises
KeyError – if the hash is not in the table
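A caller that uses a lookup of this kind has to be prepared for the KeyError. The sketch below assumes the table behaves like a plain dict; `toy_hash_lookup`, `memo_lookup_table`, and the hash string are hypothetical names and values used only for illustration.

```python
memo_lookup_table = {"abc123": 42}  # hypothetical hashsum -> result


def toy_hash_lookup(hashsum):
    """Hypothetical lookup: raises KeyError when the hash is unknown."""
    return memo_lookup_table[hashsum]


found = toy_hash_lookup("abc123")

try:
    toy_hash_lookup("missing")
    missed = False
except KeyError:
    # No memoized result exists for this hash.
    missed = True
```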
-