hpt package

hpt.binarize module

Module to binarize continuous-score predictions.

hpt.binarize.compute_binary_predictions(y_true, y_pred_scores, threshold=None, tpr=None, fpr=None, ppr=None, random_seed=42)[source]

Discretizes the given score predictions into binary labels, choosing the decision threshold according to the provided target metric.

Parameters:
  • y_true (np.ndarray) – The true binary labels

  • y_pred_scores (np.ndarray) – Predictions as a continuous score between 0 and 1

  • threshold (Optional[float], optional) – A specific (global) decision threshold to use for binarization, by default None

  • tpr (Optional[float], optional) – A target TPR (true positive rate, or recall) to achieve by thresholding, by default None

  • fpr (Optional[float], optional) – A target FPR (false positive rate) to achieve by thresholding, by default None

  • ppr (Optional[float], optional) – A target PPR (positive prediction rate) to achieve by thresholding, by default None

  • random_seed (int, optional) – The random seed to use, by default 42

Returns:

The binarized predictions according to the specified target.

Return type:

np.ndarray
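
Independent of hpt's own implementation, thresholding at a target PPR amounts to picking the score quantile that yields the desired fraction of positive predictions. A minimal numpy sketch of that idea (the function name here is illustrative, not part of the hpt API):

```python
import numpy as np

def binarize_at_ppr(y_pred_scores: np.ndarray, ppr: float) -> np.ndarray:
    """Illustrative only: threshold scores so that roughly `ppr` of the
    samples are predicted positive, via the (1 - ppr) score quantile."""
    threshold = np.quantile(y_pred_scores, 1.0 - ppr)
    return (y_pred_scores > threshold).astype(int)

scores = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.2, 0.9, 0.55])
y_pred = binarize_at_ppr(scores, ppr=0.25)  # 2 of 8 samples predicted positive
```

Targeting a TPR or FPR works analogously, except the quantile is taken over the scores of the positive or negative class only.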

hpt.evaluation module

A set of functions to evaluate predictions on common performance and fairness metrics, possibly at a specified FPR or FNR target.

hpt.evaluation.evaluate_fairness(y_true, y_pred, sensitive_attribute, return_groupwise_metrics=False)[source]

Evaluates fairness as the ratios between group-wise performance metrics.

Parameters:
  • y_true (np.ndarray) – The true class labels.

  • y_pred (np.ndarray) – The discretized predictions.

  • sensitive_attribute (np.ndarray) – The sensitive attribute (protected group membership).

  • return_groupwise_metrics (Optional[bool], optional) – Whether to also return the group-wise performance metrics (True) or only the ratios between them (False), by default False.

Returns:

A dictionary with key-value pairs of (metric name, metric value).

Return type:

dict
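
As a sketch of what "ratios between group-wise metrics" means (purely illustrative, not hpt's code): compute a metric such as TPR within each group, then report the ratio of the smallest to the largest value, so that 1.0 indicates parity.

```python
import numpy as np

def tpr_ratio(y_true, y_pred, sensitive_attribute):
    """Illustrative: ratio of smallest to largest group-wise TPR (1.0 = parity)."""
    tprs = []
    for group in np.unique(sensitive_attribute):
        mask = (sensitive_attribute == group) & (y_true == 1)
        tprs.append(y_pred[mask].mean())  # recall within this group
    return min(tprs) / max(tprs)

y_true = np.array([1, 1, 0, 1, 1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 1])
s      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
ratio = tpr_ratio(y_true, y_pred, s)  # group 0 TPR = 2/3, group 1 TPR = 1.0
```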

hpt.evaluation.evaluate_performance(y_true, y_pred)[source]

Evaluates the provided predictions on common performance metrics. Note: currently assumes both labels and predictions are binary.

Parameters:
  • y_true (np.ndarray) – The true class labels.

  • y_pred (np.ndarray) – The discretized predictions.

Returns:

A dictionary with key-value pairs of (metric name, metric value).

Return type:

dict
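
The kind of metrics typically reported here all derive from the confusion-matrix counts. A self-contained numpy sketch (function name and exact metric set are illustrative, not hpt's):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Illustrative: common binary-classification metrics from confusion counts."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "tpr": tp / (tp + fn),       # recall
        "fpr": fp / (fp + tn),
        "precision": tp / (tp + fp),
    }

m = binary_metrics(np.array([1, 0, 1, 1, 0, 0]),
                   np.array([1, 0, 0, 1, 1, 0]))
```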

hpt.evaluation.evaluate_predictions(y_true, y_pred_scores, sensitive_attribute=None, return_groupwise_metrics=False, **threshold_target)[source]

Evaluates the given predictions on both performance and fairness metrics (if sensitive_attribute is provided).

Parameters:
  • y_true (np.ndarray) – The true labels.

  • y_pred_scores (np.ndarray) – The predicted scores.

  • sensitive_attribute (np.ndarray, optional) – The sensitive attribute, i.e., which protected group each sample belongs to. If not provided, fairness metrics will not be computed.

  • return_groupwise_metrics (bool) – Whether to return groupwise performance metrics (requires providing sensitive_attribute).

Returns:

A dictionary with key-value pairs of (metric name, metric value).

Return type:

dict

hpt.evaluation.evaluate_predictions_bootstrap(y_true, y_pred_scores, sensitive_attribute, k=200, confidence_pct=95, seed=42)[source]

Evaluates the given predictions over k bootstrap resamples, with confidence intervals at the given confidence percentage.

Return type:

Tuple[dict, dict]

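
Judging from the signature (k resamples, a confidence percentage, a seed), this follows the standard percentile-bootstrap recipe. A minimal sketch of that recipe for a single metric (illustrative, not hpt's implementation):

```python
import numpy as np

def bootstrap_ci(y_true, y_pred_scores, metric, k=200, confidence_pct=95, seed=42):
    """Illustrative: percentile-bootstrap confidence interval for `metric`."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    estimates = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)  # resample with replacement
        estimates.append(metric(y_true[idx], y_pred_scores[idx]))
    low_pct = (100 - confidence_pct) / 2
    return np.percentile(estimates, [low_pct, 100 - low_pct])

accuracy = lambda yt, ys: np.mean(yt == (ys > 0.5))
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 1])
scores = np.array([0.2, 0.8, 0.6, 0.4, 0.9, 0.3, 0.7, 0.4, 0.1, 0.6])
low, high = bootstrap_ci(y_true, scores, accuracy)
```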
hpt.evaluation.safe_division(a, b, on_error_return=0)[source]
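
This helper is undocumented here; its name and signature suggest the usual pattern of falling back to a default when division fails. An illustrative sketch of such a helper (not necessarily hpt's exact code):

```python
def safe_division(a, b, on_error_return=0):
    """Illustrative: divide a by b, returning a fallback value on error
    (e.g., division by zero)."""
    try:
        return a / b
    except ZeroDivisionError:
        return on_error_return
```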

hpt.suggest module

Module to suggest and sample hyperparameters from distributions defined in a hyperparameter space.

hpt.suggest.suggest_callable_hyperparams(trial, hyperparameter_space, param_prefix='learner')[source]

Suggests the top-level hyperparameters for a class instantiation, or for parameterizing any other callable.

This includes the classpath/importpath, and the conditional hyperparameters to use as kwargs for the class.

Parameters:
  • trial (BaseTrial) – The trial object to interact with the hyperparameter sampler.

  • hyperparameter_space (dict) – A dictionary representing a hyperparameter space.

  • param_prefix (str) – The prefix to attach to all parameters’ names in order to uniquely identify them.

Returns:

The suggested callable, together with its suggested keyword arguments.

Return type:

Dict

hpt.suggest.suggest_hyperparams(trial, hyperparameter_space, param_prefix='')[source]

Uses the provided hyperparameter space to suggest specific configurations using the given Trial object.

Parameters:
  • trial (BaseTrial) – The trial object to interact with the hyperparameter sampler.

  • hyperparameter_space (dict) – A dictionary representing a hyperparameter space.

  • param_prefix (str) – The prefix to attach to all parameters’ names in order to uniquely identify them.

Returns:

An instantiation of the given hyperparameter space.

hpt.suggest.suggest_numerical_hyperparam(trial, config, param_id)[source]

Helper function to suggest a numerical hyperparameter.

Parameters:
  • trial (BaseTrial) – The trial object to interact with the hyperparameter sampler.

  • config (dict) – The distribution to sample the parameter from.

  • param_id (str) – The parameter’s name.

Returns:

The sampled value.

hpt.suggest.suggest_random_hyperparams(hyperparameter_space, seed)[source]

Suggests a random set of hyperparameters from the given hyperparameter space.

NOTE: this is a deterministic function of the given seed number; the seed must itself be randomly drawn for each function call if you want to get a random sample of hyperparameter configurations.

Parameters:
  • hyperparameter_space (Union[dict, str, Path]) – A dict or a path to a YAML file representing a hyperparameter space. If a path is provided, this function will load the hyperparameter space from that path; if a dict is provided, it assumes the space has already been loaded (this avoids unnecessarily re-loading the same file from disk multiple times).

  • seed (int) – The random seed used to generate the set of hyperparameters; the output is a deterministic function of this seed.

Returns:

A set of hyperparameters that was randomly drawn from the given hyperparameter space.

Return type:

dict
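
The determinism note above can be illustrated with a toy hyperparameter space: sampling with the same seed always yields the same configuration. A stdlib sketch (the space format here is illustrative, not hpt's actual schema):

```python
import random

def sample_hyperparams(space: dict, seed: int) -> dict:
    """Illustrative: deterministically sample one value per hyperparameter."""
    rng = random.Random(seed)  # all randomness derives from this seed
    config = {}
    for name, spec in space.items():
        if isinstance(spec, list):        # categorical: list of choices
            config[name] = rng.choice(spec)
        else:                             # numerical: (low, high) uniform range
            low, high = spec
            config[name] = rng.uniform(low, high)
    return config

space = {"max_depth": [3, 5, 7], "learning_rate": (0.01, 0.3)}
# Same seed, same configuration:
assert sample_hyperparams(space, seed=7) == sample_hyperparams(space, seed=7)
```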

hpt.thresholding_evaluation module

Deprecated since version 0.3: This submodule has been deprecated.

A set of utility functions to generate group-wise thresholds that enforce some fairness criteria (or simply maximize global performance).

class hpt.thresholding_evaluation.ThresholdingEvaluation(y_true, s_true)[source]

Bases: object

compute_global_accuracy(groupwise_fpr, groupwise_tpr)[source]

Computes global accuracy from groupwise FPR and TPR metrics.

Parameters:
  • groupwise_fpr (dict) – A dictionary that maps a group to its FPR value.

  • groupwise_tpr (dict) – A dictionary that maps a group to its TPR value.

Returns:

The value for global accuracy, between 0.0 and 1.0.

Return type:

float
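
Given each group's size and prevalence (which the class can derive from the y_true and s_true it receives at construction), global accuracy decomposes over groups as a weighted sum of prevalence * TPR + (1 - prevalence) * (1 - FPR). An illustrative sketch of that decomposition (argument names are assumptions, not hpt's signature):

```python
def global_accuracy(groupwise_fpr, groupwise_tpr, group_sizes, group_prevalence):
    """Illustrative: global accuracy as a group-size-weighted sum of
    per-group accuracies, each computed from that group's TPR and FPR."""
    total = sum(group_sizes.values())
    acc = 0.0
    for g, n in group_sizes.items():
        p = group_prevalence[g]  # fraction of positives in group g
        group_acc = p * groupwise_tpr[g] + (1 - p) * (1 - groupwise_fpr[g])
        acc += (n / total) * group_acc
    return acc

acc = global_accuracy(
    groupwise_fpr={"A": 0.2, "B": 0.1},
    groupwise_tpr={"A": 0.9, "B": 0.8},
    group_sizes={"A": 60, "B": 40},
    group_prevalence={"A": 0.5, "B": 0.25},
)
```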

post_hoc_fairness(y_pred_scores, equal_fpr=True, equal_tpr=True, fpr_tolerance=0.0001, tpr_tolerance=0.0001, n_thresholds=100, show_progress=False, plot_rocs=False)[source]

hpt.tuner module

A simple wrapper for optuna hyperparameter tuners.

class hpt.tuner.ObjectiveFunction(X_train, y_train, X_val, y_val, *, hyperparameter_space, eval_metric, s_train=None, s_val=None, X_test=None, y_test=None, s_test=None, other_eval_metric=None, alpha=0.5, eval_func=None, return_groupwise_metrics=False, **threshold_target)[source]

Bases: object

Callable objective function to be used with optuna.

class TrialResults(id, hyperparameters, validation_results, test_results=None, train_results=None, model=None, fit_time=None, algorithm=None)[source]

Bases: object

algorithm: str = None
fit_time: float = None
hyperparameters: dict
id: int
model: BaseLearner = None
test_results: dict = None
train_results: dict = None
validation_results: dict
property all_results
property best_trial
evaluate_model(model, X, y, s=None)[source]
Return type:

dict

static fit_model(model, X_train, y_train, s_train=None, verbose=True)[source]
get_results(type_='validation')[source]
static instantiate_model(classpath, **hyperparams)[source]

Instantiates a model using the provided class path and provided hyperparameters.

Parameters:
  • classpath (str) – The classpath for importing the model’s constructor/class.

  • hyperparams (dict) – A dictionary of hyperparameter values to use as keyword arguments.

Returns:

The model object.

Return type:

object
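
Instantiating from a classpath is typically done with importlib; a sketch of the general pattern (not hpt's exact code), using a stdlib class as the example target:

```python
import importlib

def instantiate_from_classpath(classpath: str, **hyperparams):
    """Illustrative: import `package.module.ClassName` and call it with kwargs."""
    module_path, _, class_name = classpath.rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**hyperparams)

counter = instantiate_from_classpath("collections.Counter", a=2, b=1)
```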

plot(x_axis=None, y_axis=None, pyplot_show=True, data_type='validation', **kwargs)[source]
reconstruct_model(trial_results)[source]
property results
class hpt.tuner.OptunaTuner(objective_function, sampler=None, seed=42, direction='maximize', **study_kwargs)[source]

Bases: object

A thin helper class that wraps common optuna boilerplate.

optimize(**kwargs)[source]
property results