h_threadpool.FutureAdapter

This is an adapter to delegate execution of the individual nodes in a Hamilton graph to a threadpool. This is useful when you have a graph with many nodes that can be executed in parallel.

class hamilton.plugins.h_threadpool.FutureAdapter(max_workers: int = None, thread_name_prefix: str = '', result_builder: ResultBuilder = None)

Adapter that lazily submits each function for execution to a ThreadPoolExecutor.

This adapter behaves similarly to the async Hamilton driver, which allows functions to execute in parallel.

This adapter works because threads share memory within a single process, so we don’t have to worry about serializing the objects passed between functions.

Caveats:

  • DAGs with many CPU-intensive functions will see limited benefit from this adapter, unless those functions release the GIL.

  • DAGs with a lot of I/O-bound work, e.g. making API calls, will benefit from this adapter.

  • The maximum parallelism is limited by the number of threads in the ThreadPoolExecutor.

Unsupported behavior:

  • The FutureAdapter does not support DAGs with Parallelizable & Collect functions. This has simply not been implemented yet; there is no inherent technical obstacle. If you’d like this feature, please open an issue on the Hamilton repository.
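The I/O-bound caveat and the parallelism limit can be illustrated with the standard library alone. The following is a minimal sketch, not Hamilton code: `fake_api_call` and `run_in_pool` are hypothetical names, the sleep stands in for a network request, and the pool sizes are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def fake_api_call(i: int) -> int:
    """Hypothetical stand-in for an I/O-bound call; sleeping releases the GIL."""
    time.sleep(0.2)
    return i * 2


def run_in_pool(n_tasks: int, max_workers: int) -> Tuple[List[int], float]:
    """Run n_tasks calls on a thread pool and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fake_api_call, range(n_tasks)))
    return results, time.perf_counter() - start


# Four waiting tasks overlap when four threads are available...
results, elapsed_parallel = run_in_pool(n_tasks=4, max_workers=4)
# ...but run one after another when the pool has a single thread.
_, elapsed_serial = run_in_pool(n_tasks=4, max_workers=1)

print(results)
print(f"4 workers: {elapsed_parallel:.2f}s, 1 worker: {elapsed_serial:.2f}s")
```

A CPU-bound function that holds the GIL would not show the same speedup, since only one thread can execute Python bytecode at a time.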

__init__(max_workers: int = None, thread_name_prefix: str = '', result_builder: ResultBuilder = None)

Constructor.

Parameters:
  • max_workers – The maximum number of threads that can be used to execute the given calls.

  • thread_name_prefix – An optional name prefix to give our threads.

  • result_builder – Optional. Result builder to use for building the result.
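Assuming `max_workers` and `thread_name_prefix` are forwarded to `concurrent.futures.ThreadPoolExecutor` (the parameter descriptions suggest this, but it is not stated here), their effect can be sketched with the standard library. The prefix `hamilton-node` is an arbitrary example value, and `current_thread_name` is a hypothetical helper.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def current_thread_name() -> str:
    """Return the name of the worker thread running this call."""
    return threading.current_thread().name


# Assumption: FutureAdapter passes these two arguments through unchanged.
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="hamilton-node") as pool:
    names = [pool.submit(current_thread_name).result() for _ in range(3)]

# Each name starts with the chosen prefix, which helps when reading logs.
print(names)
```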

build_result(**outputs: Any) → Any

Given a set of outputs, build the result.

This function will block until all futures are resolved.

Parameters:

outputs – the outputs from the execution of the graph.

Returns:

the result of the execution of the graph.
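The blocking behavior can be sketched as follows. This illustrates the idea only and is not the adapter's actual code: `resolve_outputs` is a hypothetical helper that replaces every `Future` in the outputs with its result, waiting where necessary, and `slow_square` is a made-up node function.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Dict


def resolve_outputs(outputs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: block on each Future and pass other values through."""
    return {
        name: value.result() if isinstance(value, Future) else value
        for name, value in outputs.items()
    }


def slow_square(x: int) -> int:
    """Made-up node function that takes a moment to finish."""
    time.sleep(0.1)
    return x * x


with ThreadPoolExecutor(max_workers=2) as pool:
    # Two outputs are still futures; the third is already a plain value.
    pending = {"a": pool.submit(slow_square, 3), "b": pool.submit(slow_square, 4), "c": 5}
    resolved = resolve_outputs(pending)

print(resolved)  # {'a': 9, 'b': 16, 'c': 5}
```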

do_build_result(outputs: Dict[str, Any]) → Any

Implements the do_build_result method from the BaseDoBuildResult class. This is hidden from the user, as the public-facing API is build_result, which allows us to change the API/implementation of the internal set of hooks.

do_remote_execute(*, execute_lifecycle_for_node: Callable, node: Node, **kwargs: Dict[str, Any]) → Any

Submits the passed-in function to the ThreadPoolExecutor for execution, after wrapping it with the _new_fn function.

Parameters:
  • node – Node that is being executed

  • execute_lifecycle_for_node – Function executing lifecycle_hooks and lifecycle_methods

  • kwargs – Keyword arguments that are being passed into the function
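One way to picture the wrapping step: before a node's function runs on a worker thread, any upstream values that are still futures are resolved. The sketch below assumes this is what the wrapper does; `resolve_then_call`, `load`, and `double` are hypothetical names, not part of Hamilton.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable


def resolve_then_call(fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Hypothetical wrapper: wait for Future inputs, then call fn with plain values."""
    ready = {k: v.result() if isinstance(v, Future) else v for k, v in kwargs.items()}
    return fn(**ready)


def load() -> int:
    """Made-up upstream node."""
    return 10


def double(value: int) -> int:
    """Made-up downstream node that depends on load()."""
    return value * 2


with ThreadPoolExecutor(max_workers=2) as pool:
    # The upstream node is submitted first; the downstream node receives its Future.
    load_future = pool.submit(resolve_then_call, load)
    double_future = pool.submit(resolve_then_call, double, value=load_future)
    final = double_future.result()

print(final)  # 20
```

In this sketch the upstream call is submitted before the call that depends on it, so a worker that blocks on a `Future` is waiting on work that is already queued or running.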

input_types() → List[Type[Type]]

Gives the types that this result builder applies to. This is optional for backwards compatibility, but is recommended.

Returns:

A list of types that this can apply to.

output_type() → Type

Returns the output type of this result builder.

Returns:

the type that this creates.