h_threadpool.FutureAdapter
This adapter delegates execution of the individual nodes in a Hamilton graph to a thread pool. It is useful when the graph has many nodes that can be executed in parallel.
- class hamilton.plugins.h_threadpool.FutureAdapter(max_workers: int = None, thread_name_prefix: str = '', result_builder: ResultBuilder = None)
Adapter that lazily submits each function for execution to a ThreadPoolExecutor.
This adapter behaves similarly to the async Hamilton driver, which allows functions to execute in parallel.
This adapter works because threads share memory, so we don’t have to worry about object serialization.
Caveats:
- DAGs with many CPU-intensive functions will see limited benefit from this adapter, unless those functions release the GIL.
- DAGs with a lot of I/O-bound work, e.g. making API calls, will benefit from this adapter.
- The maximum parallelism is limited by the number of threads in the ThreadPoolExecutor.
Unsupported behavior:
- The FutureAdapter does not support DAGs with Parallelizable & Collect functions. This is because the feature has not been implemented yet, not because of any inherent technical limitation. If you’d like this feature, please open an issue on the Hamilton repository.
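The I/O-bound caveat above can be illustrated with the standard library alone. The following is a minimal sketch, not Hamilton code: `fake_api_call` is a hypothetical stand-in for a node that waits on the network, and the sleep duration is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_api_call(name: str) -> str:
    """Hypothetical I/O-bound node; time.sleep releases the GIL while waiting."""
    time.sleep(0.2)
    return f"response:{name}"


names = ["users", "orders", "items", "prices"]

# Run the four calls one after another.
start = time.perf_counter()
sequential = [fake_api_call(n) for n in names]
sequential_seconds = time.perf_counter() - start

# Run the same four calls on a pool of four threads.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(fake_api_call, names))
threaded_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, threaded: {threaded_seconds:.2f}s")
```

Because the waiting happens outside the GIL, the threaded run overlaps the waits. A CPU-bound function that holds the GIL would not show the same gain.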
- __init__(max_workers: int = None, thread_name_prefix: str = '', result_builder: ResultBuilder = None)
Constructor.
- Parameters:
max_workers – The maximum number of threads that can be used to execute the given calls.
thread_name_prefix – An optional name prefix to give our threads.
result_builder – Optional. Result builder to use for building the result.
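The first two parameters carry the same names as the arguments of the standard library's `concurrent.futures.ThreadPoolExecutor`. Assuming the adapter forwards them to the executor unchanged (an assumption, not something this page states), their effect can be sketched with the executor directly. The prefix `hamilton-node` is an arbitrary illustrative value.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def current_thread_name() -> str:
    """Report which worker thread ran this task."""
    return threading.current_thread().name


# max_workers caps the number of threads; thread_name_prefix labels them,
# which helps when reading logs or a debugger's thread list.
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="hamilton-node") as pool:
    futures = [pool.submit(current_thread_name) for _ in range(4)]
    thread_names = {f.result() for f in futures}

print(sorted(thread_names))
```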
- build_result(**outputs: Any) → Any
Given a set of outputs, build the result.
This function will block until all futures are resolved.
- Parameters:
outputs – the outputs from the execution of the graph.
- Returns:
the result of the execution of the graph.
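The blocking behavior described for build_result can be sketched as follows. This is an illustration under assumptions, not the adapter's actual implementation: `resolve_outputs` and `slow_square` are hypothetical names, and the sketch stops at resolving futures rather than handing the values to a result builder.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Dict


def resolve_outputs(outputs: Dict[str, Any]) -> Dict[str, Any]:
    """Replace every Future with its value; Future.result() blocks until done."""
    return {
        name: value.result() if isinstance(value, Future) else value
        for name, value in outputs.items()
    }


def slow_square(x: int) -> int:
    """Hypothetical node that takes a moment to finish."""
    time.sleep(0.1)
    return x * x


with ThreadPoolExecutor(max_workers=2) as pool:
    # Two outputs are still futures; one is already a plain value.
    raw_outputs = {
        "a": pool.submit(slow_square, 3),
        "b": pool.submit(slow_square, 4),
        "c": 5,
    }
    resolved = resolve_outputs(raw_outputs)

print(resolved)
```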
- do_build_result(outputs: Dict[str, Any]) → Any
Implements the do_build_result method from the BaseDoBuildResult class. This is hidden from the user because the public-facing API is build_result, which allows us to change the API/implementation of the internal set of hooks.
- do_remote_execute(*, execute_lifecycle_for_node: Callable, node: Node, **kwargs: Dict[str, Any]) → Any
Submits the passed-in function to the ThreadPoolExecutor for execution, after wrapping it with the _new_fn function.
- Parameters:
node – Node that is being executed
execute_lifecycle_for_node – Function executing lifecycle_hooks and lifecycle_methods
kwargs – Keyword arguments that are being passed into the function
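One plausible shape for this wrap-and-submit step is sketched below with the standard library. The helper names `_resolve_then_call` and `submit_node` are hypothetical and only stand in for the role the text assigns to `_new_fn`. The sketch assumes the wrapper's job is to resolve any futures among the keyword arguments inside the worker thread before calling the function, which is possible because threads share memory.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


def _resolve_then_call(fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Runs in a worker thread: wait for upstream futures, then call fn."""
    resolved = {
        key: value.result() if isinstance(value, Future) else value
        for key, value in kwargs.items()
    }
    return fn(**resolved)


def submit_node(pool: ThreadPoolExecutor, fn: Callable[..., Any], **kwargs: Any) -> Future:
    """Submit fn without blocking the caller; upstream values may still be futures."""
    return pool.submit(_resolve_then_call, fn, **kwargs)


# Three hypothetical nodes; parameter names match upstream function names.
def load() -> int:
    return 10


def double(load: int) -> int:
    return load * 2


def total(load: int, double: int) -> int:
    return load + double


with ThreadPoolExecutor(max_workers=2) as pool:
    # Nodes are submitted in dependency order, each receiving upstream futures.
    load_future = submit_node(pool, load)
    double_future = submit_node(pool, double, load=load_future)
    total_future = submit_node(pool, total, load=load_future, double=double_future)
    total_value = total_future.result()

print(total_value)
```

Submitting in dependency order matters in this sketch: a worker that waits on an upstream future relies on that upstream task having been queued before it.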
- input_types() → List[Type[Type]]
Gives the applicable types to this result builder. This is optional for backwards compatibility, but is recommended.
- Returns:
A list of types that this can apply to.
- output_type() → Type
Returns the output type of this result builder.
- Returns:
the type that this creates