h_ray.RayGraphAdapter#

The graph adapter to delegate execution of the individual nodes in a Hamilton graph to Ray.

class hamilton.plugins.h_ray.RayGraphAdapter(result_builder: ResultMixin)#

Class representing what’s required to make Hamilton run on Ray.

This walks the graph and translates it to run onto Ray.

Use pip install sf-hamilton[ray] to get the dependencies required to run this.

Use this if:

  • you want to utilize multiple cores on a single machine, or you want to scale to larger data set sizes with a Ray cluster that you can connect to. Note (1): you are still constrained by machine memory size with Ray; you can’t just scale to any dataset size. Note (2): serialization costs can outweigh the benefits of parallelism so you should benchmark your code to see if it’s worth it.

Notes on scaling:#

  • Multi-core on single machine βœ…

  • Distributed computation on a Ray cluster βœ…

  • Scales to any size of data ⛔️; you are LIMITED by the memory on the instance/computer πŸ’».

Function return object types supported:#

  • Works for any python object that can be serialized by the Ray framework. βœ…

Pandas?#

  • ⛔️ Ray DOES NOT do anything special about Pandas.

CAVEATS#

  • Serialization costs can outweigh the benefits of parallelism, so you should benchmark your code to see if it’s worth it.

DISCLAIMER – this class is experimental, so signature changes are a possibility!

__init__(result_builder: ResultMixin)#

Constructor

You have the ability to pass in a ResultMixin object to the constructor to control the return type that gets produce by running on Ray.

Parameters:

result_builder – Required. An implementation of base.ResultMixin.

build_result(**outputs: Dict[str, Any]) Any#

Builds the result and brings it back to this running process.

Parameters:

outputs – the dictionary of key -> Union[ray object reference | value]

Returns:

The type of object returned by self.result_builder.

static check_input_type(node_type: Type, input_value: Any) bool#

Used to check whether the user inputs match what the execution strategy & functions can handle.

Parameters:
  • node_type – The type of the node.

  • input_value – An actual value that we want to inspect matches our expectation.

Returns:

static check_node_type_equivalence(node_type: Type, input_type: Type) bool#

Used to check whether two types are equivalent.

This is used when the function graph is being created and we’re statically type checking the annotations for compatibility.

Parameters:
  • node_type – The type of the node.

  • input_type – The type of the input that would flow into the node.

Returns:

execute_node(node: Node, kwargs: Dict[str, Any]) Any#

Function that is called as we walk the graph to determine how to execute a hamilton function.

Parameters:
  • node – the node from the graph.

  • kwargs – the arguments that should be passed to it.

Returns:

returns a ray object reference.

input_types() List[Type[Type]]#

Gives the applicable types to this result builder. This is optional for backwards compatibility, but is recommended.

Returns:

A list of types that this can apply to.

output_type() Type#

Returns the output type of this result builder :return: the type that this creates