This decorator applies to a function that returns a dictionary whose fields you want to make individually available for consumption. It expands the single function into n functions, each of which takes the output dictionary and returns one specific field, as named in the extract_fields decorator.

from typing import Dict

import numpy as np
from hamilton.function_modifiers import extract_fields

# Each key names a field to expose as its own node; each value is that field's type.
@extract_fields(
    {'X_train': np.ndarray, 'X_test': np.ndarray, 'y_train': np.ndarray, 'y_test': np.ndarray})
def train_test_split_func(feature_matrix: np.ndarray,
                          target: np.ndarray,
                          test_size_fraction: float,
                          shuffle_train_test_split: bool) -> Dict[str, np.ndarray]:
    return {'X_train': ... }

The input to the decorator is a dictionary mapping field_name to field_type. This information is used during static compilation to check that downstream consumers expect the right type.
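To make the idea of a type check before execution concrete, here is a minimal pure-Python sketch. It is not Hamilton's implementation: declared_fields, train_size, bad_consumer, and mismatched_params are hypothetical names, and the sketch only compares a consumer's parameter annotations against the declared field types.

```python
from typing import get_type_hints

# Hypothetical field declaration, in the same shape that extract_fields accepts.
declared_fields = {"X_train": list, "y_train": list}


def train_size(X_train: list) -> int:
    # Hypothetical consumer whose annotation matches the declared type.
    return len(X_train)


def bad_consumer(y_train: dict) -> int:
    # Hypothetical consumer whose annotation disagrees with the declared type.
    return len(y_train)


def mismatched_params(fn, fields):
    """Return parameter names whose annotation differs from the declared field type."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only parameters matter here
    return [
        name
        for name, annotated in hints.items()
        if name in fields and fields[name] is not annotated
    ]


print(mismatched_params(train_size, declared_fields))
print(mismatched_params(bad_consumer, declared_fields))
```

The check runs without calling either consumer, which is the point of declaring field types up front: a mismatch can be reported before any data is computed.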

Reference Documentation

class hamilton.function_modifiers.extract_fields(fields: dict, fill_with: Any = None)

Extracts fields from a function's dictionary output.

__init__(fields: dict, fill_with: Any = None)

Constructor for a modifier that expands a single function into the following nodes:

  • n functions, each of which takes in the original dict and outputs a specific field

  • 1 function that outputs the original dict

Parameters:

  • fields – Fields to extract. A dict of ‘field_name’ -> ‘field_type’.

  • fill_with – Value to use when a field you want to extract does not exist in the output dict. Leave empty/None to raise an error instead; set fill_with to supply a default value for the missing field.
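The expansion and the fill_with behaviour described above can be sketched in plain Python. This is an illustration under stated assumptions, not Hamilton's code: make_extractors is a hypothetical helper, the declared types are carried along but not enforced, and a missing field raises KeyError here, which may differ from the exception Hamilton raises.

```python
from typing import Any, Callable, Dict


def make_extractors(
    fields: Dict[str, type], fill_with: Any = None
) -> Dict[str, Callable[[dict], Any]]:
    """Build one extractor function per declared field (hypothetical helper)."""

    def make_one(name: str) -> Callable[[dict], Any]:
        def extractor(output: dict) -> Any:
            if name in output:
                return output[name]
            if fill_with is None:
                # No default supplied: a missing field is an error.
                raise KeyError(f"field {name!r} not found in output dict")
            # Default supplied: use it in place of the missing field.
            return fill_with

        return extractor

    return {name: make_one(name) for name in fields}


result = {"a": 1}  # output dict that lacks the declared field "b"

lenient = make_extractors({"a": int, "b": int}, fill_with=0)
print(lenient["a"](result), lenient["b"](result))

strict = make_extractors({"a": int, "b": int})
try:
    strict["b"](result)
except KeyError as err:
    print("missing field:", err)
```

One consequence of using None as the "error out" signal is that None itself cannot serve as a fill value in this sketch.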