humancompatible.explain.glance.local_cfs package

Submodules

humancompatible.explain.glance.local_cfs.dice_method module

class humancompatible.explain.glance.local_cfs.dice_method.DiceMethod

Bases: LocalCounterfactualMethod

Implementation of the DiCE method for generating counterfactual instances (https://interpret.ml/DiCE/).

The DiCE method uses a specified machine learning model and dataset to generate counterfactual examples, showing how changes in feature values can change the model's predictions.

Methods:

__init__():

Initializes the DiceMethod instance.

fit(model, data, outcome_name, continuous_features, feat_to_vary, random_seed=13):

Fits the DiceMethod to the provided dataset, preparing the counterfactual generator.

explain_instances(instances, num_counterfactuals):

Generates counterfactual instances for the specified input instances.

Initializes a new instance of the DiceMethod class.

Attributes:

cf_generator : None or dice_ml.Dice

Counterfactual generator instance, initially set to None.

explain_instances(instances: DataFrame, num_counterfactuals: int) → DataFrame

Generates counterfactual instances for the specified input instances.

Parameters:

instances : pd.DataFrame

DataFrame containing the instances for which counterfactuals are generated.

num_counterfactuals : int

The number of counterfactuals to generate for each instance.

Returns:

pd.DataFrame

A DataFrame containing the generated counterfactuals.

Raises:

ValueError

If the counterfactual generator has not been initialized (fit method not called).

fit(model, data, outcome_name, continuous_features, feat_to_vary, random_seed=13)

Fits the DiceMethod to the provided dataset by creating a counterfactual generator.

Parameters:

model : object

A machine learning model used for predictions.

data : pd.DataFrame

The dataset containing features and the outcome variable.

outcome_name : str

The name of the outcome variable in the dataset.

continuous_features : List[str]

A list of names for continuous (numerical) features.

feat_to_vary : List[str]

A list of feature names that can be varied to generate counterfactuals.

random_seed : int, optional

Seed for random number generation to ensure reproducibility, by default 13.
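The documented call order (fit first, then explain_instances) can be sketched with a small helper. This is only an illustration: run_local_method is a hypothetical function, not part of the package, and it relies on nothing beyond the fit and explain_instances signatures listed above, so it says nothing about how DiceMethod works internally.

```python
def run_local_method(method, model, data, outcome_name,
                     continuous_features, feat_to_vary,
                     instances, num_counterfactuals, random_seed=13):
    """Fit a local counterfactual method, then explain `instances`.

    `method` is any object exposing the documented `fit` and
    `explain_instances` signatures (for example a DiceMethod instance).
    This helper is hypothetical and not part of the package.
    """
    # fit() must come first: explain_instances() is documented to raise
    # ValueError when the counterfactual generator has not been created.
    method.fit(
        model,
        data,
        outcome_name,
        continuous_features,
        feat_to_vary,
        random_seed=random_seed,
    )
    return method.explain_instances(instances, num_counterfactuals)
```

With the package installed, a call might look like run_local_method(DiceMethod(), model, data, "outcome", ["age"], ["age", "job"], affected_rows, 3), where the column names and affected_rows are placeholders for the caller's own data.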

humancompatible.explain.glance.local_cfs.nearest_neighbor module

class humancompatible.explain.glance.local_cfs.nearest_neighbor.NearestNeighborMethod

Bases: LocalCounterfactualMethod

NearestNeighborMethod is a local counterfactual method that explains an instance by returning its nearest unaffected neighbors in the training dataset as counterfactuals.

This method identifies the unaffected instances in the training set from the model's predictions and, for each new instance, uses the nearest of them (by feature similarity) as counterfactual explanations.

Methods:

__init__():

Initializes the NearestNeighborMethod instance.

fit(model, data, outcome_name, continuous_features, feat_to_vary, random_seed=13):

Fits the method to the training data by identifying unaffected instances based on model predictions and preparing the feature encoding for nearest neighbor searches.

explain_instances(instances, num_counterfactuals):

Finds and returns the nearest unaffected neighbors for each instance, generating the specified number of counterfactual explanations.

Initializes a new instance of the NearestNeighborMethod class.

explain_instances(instances: DataFrame, num_counterfactuals: int) → DataFrame

Generates counterfactual explanations for the provided instances by finding the nearest unaffected neighbors in the training data.

Parameters:

instances : pd.DataFrame

DataFrame containing the instances for which counterfactual explanations are needed.

num_counterfactuals : int

The number of counterfactuals to generate for each instance.

Returns:

pd.DataFrame

A DataFrame containing the nearest unaffected neighbors (counterfactuals) for each instance.

Notes:

  • If the requested number of counterfactuals exceeds the number of available unaffected instances, a warning is issued and all unaffected instances are used.

  • Nearest neighbors are determined using a one-hot encoded feature representation.
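The two notes above can be illustrated with a pure-Python sketch. It is not the package's implementation: rows are plain dicts rather than DataFrame rows, the helper names encode and nearest_unaffected are hypothetical, and Euclidean distance on the one-hot vectors is an assumption made for the example.

```python
import math
import warnings


def encode(row, columns, categories):
    """One-hot encode a dict row; numeric columns pass through as floats."""
    vec = []
    for name in columns:
        if name in categories:
            # One 0/1 flag per known level of the categorical column.
            vec.extend(1.0 if row[name] == level else 0.0
                       for level in categories[name])
        else:
            vec.append(float(row[name]))
    return vec


def nearest_unaffected(instance, unaffected, columns, categories, k):
    """Return the k unaffected rows closest to `instance` (hypothetical helper)."""
    if k > len(unaffected):
        warnings.warn("Requested more counterfactuals than unaffected rows; "
                      "using all of them.")
        k = len(unaffected)
    target = encode(instance, columns, categories)
    # Euclidean distance in the encoded space (an assumption of this sketch).
    ranked = sorted(
        unaffected,
        key=lambda row: math.dist(target, encode(row, columns, categories)),
    )
    return ranked[:k]


columns = ["age", "job"]
categories = {"job": ["clerk", "engineer"]}
unaffected = [
    {"age": 40, "job": "engineer"},
    {"age": 31, "job": "clerk"},
    {"age": 55, "job": "clerk"},
]
instance = {"age": 30, "job": "clerk"}
counterfactuals = nearest_unaffected(instance, unaffected, columns, categories, k=2)
```

Here the 31-year-old clerk ranks first and the 40-year-old engineer second. Because the numeric column is not scaled in this sketch, age differences dominate the one-hot flags; a real implementation may normalize features before measuring distance.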

fit(model, data: DataFrame, outcome_name: str, continuous_features: List[str], feat_to_vary: List[str], random_seed=13)

Fits the NearestNeighborMethod by identifying unaffected instances in the training dataset and preparing feature encodings for counterfactual search.

Parameters:

model : object

A machine learning model with a predict method that outputs binary predictions (0 or 1).

data : pd.DataFrame

A dataset containing the features and outcome variable used for fitting the method.

outcome_name : str

The name of the outcome column in the dataset.

continuous_features : List[str]

A list of continuous (numerical) feature column names.

feat_to_vary : List[str]

A list of features allowed to vary when generating counterfactuals.

random_seed : int, optional

Seed for random number generation to ensure reproducibility, by default 13.

humancompatible.explain.glance.local_cfs.random_sampling module

class humancompatible.explain.glance.local_cfs.random_sampling.RandomSampling(model, n_most_important, n_categorical_most_frequent, numerical_features, categorical_features, random_state=None)

Bases: LocalCounterfactualMethod

RandomSampling is a local counterfactual method that generates counterfactual instances through random sampling based on the distribution of features in the unaffected training data.

This method identifies the most important features and the most frequent categories within the unaffected training data to generate counterfactuals by sampling from these distributions.

Methods:

__init__(model, n_most_important, n_categorical_most_frequent, numerical_features, categorical_features, random_state=None):

Initializes the RandomSampling instance with the specified parameters.

fit(X, y):

Fits the RandomSampling method to the provided training data by calculating feature importances and identifying unaffected instances.

_sample_instances(n_samples, fixed_feature_values, random_state=None):

Samples instances based on the specified feature distributions, fixing certain feature values while sampling others.

explain(instance, num_counterfactuals, n_samples=1000, random_state=None):

Generates counterfactual explanations for a given instance by sampling and modifying feature values.

explain_instances(instances, num_counterfactuals, n_samples=1000, random_state=None):

Generates counterfactuals for multiple instances by calling the explain method for each instance.

Initializes a new instance of the RandomSampling class.

Parameters:

model : object

A machine learning model used for predictions and feature importance evaluation.

n_most_important : int

The number of most important features to consider when generating counterfactuals.

n_categorical_most_frequent : int

The number of most frequent categories to consider for categorical features.

numerical_features : List[str]

A list of continuous (numerical) feature names.

categorical_features : List[str]

A list of categorical feature names.

random_state : int, optional

Seed for random number generation to ensure reproducibility, by default None.
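To make the role of n_categorical_most_frequent concrete, the sketch below keeps only the most frequent levels of one categorical column. It is an illustration under assumptions: most_frequent_categories is a hypothetical helper, the column is a plain list, and the package may select categories differently.

```python
from collections import Counter


def most_frequent_categories(values, n):
    """Return the `n` most frequent levels in `values`, most frequent first."""
    return [level for level, _count in Counter(values).most_common(n)]


# Hypothetical "job" column taken from unaffected training rows.
jobs = ["clerk", "engineer", "clerk", "nurse", "engineer", "clerk"]
top_jobs = most_frequent_categories(jobs, n=2)
```

With n=2 the rare level "nurse" is dropped, so a sampler restricted to top_jobs would only ever propose "clerk" or "engineer" for this feature.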

explain(instance, num_counterfactuals, n_samples=1000, random_state=None)

Generates counterfactual explanations for a given instance by sampling and modifying feature values.

Parameters:

instance : pd.DataFrame

A single-row DataFrame representing the instance for which counterfactuals are generated.

num_counterfactuals : int

The number of counterfactuals to generate.

n_samples : int, optional

The number of samples to draw for generating counterfactuals, by default 1000.

random_state : int, optional

Seed for random number generation, by default None.

Returns:

pd.DataFrame

A DataFrame containing the generated counterfactuals for the provided instance.

Raises:

ValueError

If the input instance is not a single-row DataFrame or if its columns do not match the training dataset’s columns.
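The sample-and-modify idea behind explain can be sketched in pure Python. This is not the package's code: sample_counterfactuals and value_pools are hypothetical names, rows are dicts instead of single-row DataFrames, values are drawn uniformly from each pool, and a prediction of 1 is assumed to be the favorable outcome.

```python
import random


def sample_counterfactuals(instance, value_pools, predict, num_counterfactuals,
                           n_samples=1000, random_state=None):
    """Draw candidates that differ from `instance` only on pooled features.

    `value_pools` maps each feature allowed to change to a list of candidate
    values (for example, values observed in unaffected training rows).
    `predict` returns 1 for the favorable outcome (an assumption of this sketch).
    """
    rng = random.Random(random_state)
    found = []
    for _ in range(n_samples):
        candidate = dict(instance)  # features outside value_pools stay fixed
        for feature, pool in value_pools.items():
            candidate[feature] = rng.choice(pool)
        # Keep only distinct candidates that flip the prediction.
        if predict(candidate) == 1 and candidate not in found:
            found.append(candidate)
            if len(found) == num_counterfactuals:
                break
    return found


def predict(row):
    """Toy model: favorable outcome when income is at least 50."""
    return 1 if row["income"] >= 50 else 0


instance = {"age": 30, "income": 20, "job": "clerk"}
pools = {"income": [20, 40, 60, 80], "job": ["clerk", "engineer"]}
cfs = sample_counterfactuals(instance, pools, predict,
                             num_counterfactuals=2, random_state=0)
```

Every returned candidate keeps age at 30, because age has no pool, and passing the same random_state reproduces the same counterfactuals. The function returns fewer than num_counterfactuals rows when n_samples draws do not yield enough valid candidates.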

explain_instances(instances: DataFrame, num_counterfactuals: int, n_samples=1000, random_state=None) → DataFrame

Generates counterfactuals for multiple instances by calling the explain method for each instance.

Parameters:

instances : pd.DataFrame

DataFrame containing instances for which counterfactual explanations are needed.

num_counterfactuals : int

The number of counterfactuals to generate for each instance.

n_samples : int, optional

The number of samples to draw for generating counterfactuals, by default 1000.

random_state : int, optional

Seed for random number generation, by default None.

Returns:

pd.DataFrame

A DataFrame containing the generated counterfactuals for all provided instances.

fit(X: DataFrame, y: Series)

Fits the RandomSampling method to the provided training data by calculating feature importances and identifying unaffected instances.

Parameters:

X : pd.DataFrame

The training dataset containing feature columns.

y : pd.Series

The target variable corresponding to the training dataset.

Returns:

self : RandomSampling

Returns the fitted instance of RandomSampling.

Module contents