FACTS: Fairness Aware Counterfactuals for Subgroups package

FACTS

These are the main entry points to using the FACTS framework.

class humancompatible.explain.facts.__init__.FACTS(clf, prot_attr, categorical_features=None, freq_itemset_min_supp=0.1, feature_weights={}, feats_allowed_to_change=None, feats_not_allowed_to_change=None)[source]

Bases: BaseEstimator

Fairness aware counterfactuals for subgroups (FACTS) detector.

FACTS is an efficient, model-agnostic, highly parameterizable, and explainable framework for evaluating subgroup fairness through counterfactual explanations [#FACTS23]_.

This class is a wrapper for the various methods exposed by the FACTS framework.

Parameters:
  • clf (sklearn.base.BaseEstimator) – A trained and ready to use classifier, implementing method predict(X), where X is the matrix of features; predictions returned by predict(X) are either 0 or 1. In other words, fitted scikit-learn classifiers.

  • prot_attr (str) – the name of the column that represents the protected attribute.

  • categorical_features (list(str), optional) – the list of categorical features. The default is to choose (dynamically, inside fit) the columns of the dataset with types “object” or “category”.

  • freq_itemset_min_supp (float, optional) – minimum support for all the runs of the frequent itemset mining algorithm (specifically, FP Growth). We mine frequent itemsets to generate candidate subpopulation groups and candidate actions. For more information, see paper [#FACTS23]_. Defaults to 10%.

  • feature_weights (dict(str, float), optional) – the weights for each feature. Used in the calculation of the cost of a suggested change. Specifically, the term corresponding to each feature is multiplied by this weight. Defaults to 1, for all features.

  • feats_allowed_to_change (list(str), optional) – if provided, only allows these features to change value in the suggested recourses. Default: no frozen features. Note: providing both feats_allowed_to_change and feats_not_allowed_to_change is currently treated as an error.

  • feats_not_allowed_to_change (list(str), optional) – if provided, prevents these features from changing at all in any given recourse. Default: no frozen features. Note: providing both feats_allowed_to_change and feats_not_allowed_to_change is currently treated as an error.

bias_scan(metric: str = 'equal-effectiveness', viewpoint: str = 'macro', sort_strategy: str = 'max-cost-diff-decr', top_count: int = 10, filter_sequence: List[str] = [], phi: float = 0.5, c: float = 0.5)[source]

Examines generated subgroups and calculates the top_count most unfair ones, with respect to the chosen metric.

Stores the final groups in instance variable self.top_rules and the respective subgroup costs in self.subgroup_costs (or self.unfairness for the “fair-tradeoff” metric).

Parameters:
  • metric (str, optional) –

    one of the following choices

    • “equal-effectiveness”

    • “equal-choice-for-recourse”

    • “equal-effectiveness-within-budget”

    • “equal-cost-of-effectiveness”

    • “equal-mean-recourse”

    • “fair-tradeoff”

    Defaults to “equal-effectiveness”.

    For explanation of each of those metrics, refer either to the paper [#FACTS23]_ or the demo_FACTS notebook.

  • viewpoint (str, optional) –

    “macro” or “micro”. Refers to the notions of “macro viewpoint” and “micro viewpoint” defined in section 2.2 of the paper [#FACTS23]_.

    As a short explanation, consider a set of actions A and a subgroup (cohort / set of individuals) G. Metrics with the macro viewpoint interpretation are constrained to always apply one action from A to the entire G, while metrics with the micro interpretation are allowed to give each individual in G the min-cost action from A which changes the individual’s class.

    Note that not all combinations of metric and viewpoint are valid, e.g. “Equal Choice for Recourse” only has a macro interpretation.

    Defaults to “macro”.

  • sort_strategy (str, optional) –

    one of the following choices

    • “max-cost-diff-decr”: simply rank the groups in descending order according to the unfairness metric.

    • “max-cost-diff-decr-ignore-forall-subgroups-empty”: ignore groups for which we have no available actions whatsoever.

    • “max-cost-diff-decr-ignore-exists-subgroup-empty”: ignore groups for which at least one protected subgroup has no available actions.

    Defaults to “max-cost-diff-decr”.

  • top_count (int, optional) – the number of subpopulation groups that the algorithm will keep. Defaults to 10.

  • filter_sequence (List[str], optional) –

    List of various filters applied on the groups and / or actions. Available filters are:

    • “remove-contained”: does not show groups which are subsumed by other shown groups. By “subsumed” we mean that the group is defined by extra feature values, but those values are not changed by any action.

    • “remove-below-thr-corr”: does not show actions which are below the given effectiveness threshold. Refer also to the documentation of parameter phi below.

    • “remove-above-thr-cost”: does not show actions that cost more than the given cost budget. Refer also to the documentation of parameter c below.

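    • “keep-rules-until-thr-corr-reached”: filters actions by a cumulative correctness threshold. Refer also to the documentation of keep_rules_until_correctness_threshold_reached.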

    • “remove-fair-rules”: does not show groups which do not exhibit bias.

    • “keep-only-min-change”: for each group shown, shows only the suggested actions that have minimum cost and ignores the others.

    Defaults to [].

  • phi (float, optional) – effectiveness threshold. Real number in [0, 1]. Applicable for “equal-choice-for-recourse” and “equal-cost-of-effectiveness” metrics. For these two metrics, an action is considered to achieve recourse for a subpopulation group if at least a fraction phi of the group’s individuals achieve recourse (e.g. phi = 0.5 means at least half of them). Defaults to 0.5.

  • c (float, optional) – cost budget. Real number. Applicable for “equal-effectiveness-within-budget” metric. Specifies the maximum cost that can be paid for an action (by the individual, by a central authority etc.). Defaults to 0.5.
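
The macro and micro viewpoints described under viewpoint can be illustrated with a toy computation. The sketch below is not the FACTS implementation; the data, the action names and the helper functions are all made up for illustration, and only the definitions (one action for the whole subgroup versus the cheapest working action per individual) come from the text above.

```python
# Toy data, entirely made up: flips[action][i] is True when applying
# `action` to individual i of the subgroup G changes the model's prediction.
flips = {
    "raise_income": [True, False, True, False],
    "more_education": [False, True, False, False],
}
costs = {"raise_income": 2.0, "more_education": 1.0}


def macro_effectiveness(flips):
    """Macro viewpoint: one action for all of G; report the best such action."""
    return max(sum(row) / len(row) for row in flips.values())


def micro_effectiveness(flips):
    """Micro viewpoint: each individual may use any action that works for them."""
    n = len(next(iter(flips.values())))
    helped = sum(any(row[i] for row in flips.values()) for i in range(n))
    return helped / n


def micro_min_costs(flips, costs):
    """Cheapest working action per individual, or None if no action works."""
    n = len(next(iter(flips.values())))
    out = []
    for i in range(n):
        feasible = [costs[a] for a, row in flips.items() if row[i]]
        out.append(min(feasible) if feasible else None)
    return out


print(macro_effectiveness(flips))     # 0.5
print(micro_effectiveness(flips))     # 0.75
print(micro_min_costs(flips, costs))  # [2.0, 1.0, 2.0, None]
```

In this toy subgroup the best single action helps half of the individuals, while letting each individual choose helps three quarters, which is why the two viewpoints can rank subgroups differently.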

fit(X: DataFrame, verbose: bool = True)[source]

Calculates subpopulation groups, actions and their respective effectiveness.

Parameters:
  • X (DataFrame) – Dataset given as a pandas.DataFrame. As in standard scikit-learn convention, it is expected to contain one instance per row and one feature / explanatory variable per column (labels not needed, we already have an ML model).

  • verbose (bool) – whether to print intermediate messages and progress bar. Defaults to True.

Raises:
  • ValueError – if feats_allowed_to_change and feats_not_allowed_to_change are given simultaneously.

  • Exception – when unreachable code is executed.

Returns:

Returns self.

Return type:

FACTS

print_recourse_report(population_sizes=None, missing_subgroup_val='N/A', show_subgroup_costs=False, show_action_costs=False, show_cumulative_plots=False, show_bias=None, show_unbiased_subgroups=True, correctness_metric=False)[source]

Prints a nicely formatted report of the results (subpopulation groups and recourses) discovered by the bias_scan method.

Parameters:
  • population_sizes (dict(str, int), optional) – Number of individuals that are given the negative prediction by the model, for each subgroup. If given, it is included in the report together with some coverage percentages.

  • missing_subgroup_val (str, optional) – Optionally specify a value of the protected attribute which denotes that it is missing and should not be included in the printed results. Defaults to “N/A”.

  • show_subgroup_costs (bool, optional) – Whether to show the costs assigned to each protected subgroup. Defaults to False.

  • show_action_costs (bool, optional) – Whether to show the costs assigned to each specific action. Defaults to False.

  • show_cumulative_plots (bool, optional) – If true, shows, for each subgroup, a graph of the effectiveness cumulative distribution, as it is called in [#FACTS23]_. Defaults to False.

  • show_bias (str, optional) – Specify which value of the protected attribute corresponds to the subgroup against which we want to find unfairness. Mainly useful for when the protected attribute is not binary (e.g. race). Defaults to None.

  • correctness_metric (bool, optional) – if True, the metric is considered to quantify utility, i.e. the greater it is for a group, the more beneficial it is for the individuals of the group. Defaults to False.

  • metric_name (str, optional) – If given, it is added to the printed message for unfairness in a subpopulation group, i.e. the method prints “Bias against females due to <metric_name>”.

Raises:

RuntimeError – if costs for groups and subgroups are empty. Most likely the bias_scan method was not run.
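
A typical call sequence for the class, as far as it can be read from the signatures above, is construct, fit, bias_scan, print_recourse_report. The sketch below wraps that sequence in a hypothetical helper named run_facts_audit; the import path is assumed from the module name shown in this reference, and clf, X and prot_attr must be supplied by the caller.

```python
def run_facts_audit(clf, X, prot_attr):
    """Fit FACTS on X, scan for bias, print a report, and return the detector.

    clf is a fitted binary classifier, X a pandas.DataFrame of features,
    prot_attr the name of the protected-attribute column in X.
    """
    # Import path assumed from the documented module name.
    from humancompatible.explain.facts import FACTS

    detector = FACTS(clf=clf, prot_attr=prot_attr, freq_itemset_min_supp=0.1)
    detector.fit(X, verbose=False)
    # Documented defaults, spelled out for readability.
    detector.bias_scan(
        metric="equal-effectiveness",
        viewpoint="macro",
        sort_strategy="max-cost-diff-decr",
        top_count=10,
    )
    detector.print_recourse_report(show_subgroup_costs=True)
    return detector
```

Because fit is separate from bias_scan, the same fitted detector can presumably be scanned again with a different metric without repeating the mining step, which is the caching benefit mentioned in the note on FACTS_bias_scan.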

humancompatible.explain.facts.__init__.FACTS_bias_scan(X: DataFrame, clf: BaseEstimator, prot_attr: str, metric: str, categorical_features: List[str] | None = None, freq_itemset_min_supp: float = 0.1, feature_weights: Dict[str, float] = {}, feats_allowed_to_change: List[str] | None = None, feats_not_allowed_to_change: List[str] | None = None, viewpoint: str = 'macro', sort_strategy: str = 'max-cost-diff-decr', top_count: int = 1, phi: float = 0.5, c: float = 0.5, verbose: bool = True, print_recourse_report: bool = False, show_subgroup_costs: bool = False, show_action_costs: bool = False, is_correctness_metric: bool = False)[source]

Identify the subgroups with the most difficulty achieving recourse.

FACTS is an efficient, model-agnostic, highly parameterizable, and explainable framework for evaluating subgroup fairness through counterfactual explanations [#FACTS23]_.

Note

This function is a wrapper to run the FACTS framework from start to finish. Its purpose is to provide an API which is both closer to the detectors API and more succinct.

For more options and greater control (including the option to cache some intermediate results and then apply more than one metric fast), consider using the FACTS class.

Parameters:
  • X (DataFrame) – Dataset given as a pandas.DataFrame. As in standard scikit-learn convention, it is expected to contain one instance per row and one feature / explanatory variable per column (labels not needed, we already have an ML model).

  • clf (sklearn.base.BaseEstimator) – A trained and ready to use classifier, implementing method predict(X), where X is the matrix of features; predictions returned by predict(X) are either 0 or 1. In other words, fitted scikit-learn classifiers.

  • prot_attr (str) – the name of the column that represents the protected attribute.

  • metric (str, optional) –

    one of the following choices

    • “equal-effectiveness”

    • “equal-choice-for-recourse”

    • “equal-effectiveness-within-budget”

    • “equal-cost-of-effectiveness”

    • “equal-mean-recourse”

    • “fair-tradeoff”

    Defaults to “equal-effectiveness”.

    For explanation of each of those metrics, refer either to the paper [#FACTS23]_ or the demo_FACTS notebook.

  • categorical_features (list(str), optional) – the list of categorical features. The default is to choose (dynamically, inside fit) the columns of the dataset with types “object” or “category”.

  • freq_itemset_min_supp (float, optional) –

    minimum support for all the runs of the frequent itemset mining algorithm (specifically, FP Growth). We mine frequent itemsets to generate candidate subpopulation groups and candidate actions. For more information, see paper [#FACTS23]_. Defaults to 10%.

  • feature_weights (dict(str, float), optional) – the weights for each feature. Used in the calculation of the cost of a suggested change. Specifically, the term corresponding to each feature is multiplied by this weight. Defaults to 1, for all features.

  • feats_allowed_to_change (list(str), optional) – if provided, only allows these features to change value in the suggested recourses. Default: no frozen features. Note: providing both feats_allowed_to_change and feats_not_allowed_to_change is currently treated as an error.

  • feats_not_allowed_to_change (list(str), optional) – if provided, prevents these features from changing at all in any given recourse. Default: no frozen features. Note: providing both feats_allowed_to_change and feats_not_allowed_to_change is currently treated as an error.

  • viewpoint (str, optional) –

    “macro” or “micro”. Refers to the notions of “macro viewpoint” and “micro viewpoint” defined in section 2.2 of the paper [#FACTS23]_.

    As a short explanation, consider a set of actions A and a subgroup (cohort / set of individuals) G. Metrics with the macro viewpoint interpretation are constrained to always apply one action from A to the entire G, while metrics with the micro interpretation are allowed to give each individual in G the min-cost action from A which changes the individual’s class.

    Note that not all combinations of metric and viewpoint are valid, e.g. “Equal Choice for Recourse” only has a macro interpretation.

    Defaults to “macro”.

  • sort_strategy (str, optional) –

    one of the following choices

    • “max-cost-diff-decr”: simply rank the groups in descending order according to the unfairness metric.

    • “max-cost-diff-decr-ignore-forall-subgroups-empty”: ignore groups for which we have no available actions whatsoever.

    • “max-cost-diff-decr-ignore-exists-subgroup-empty”: ignore groups for which at least one protected subgroup has no available actions.

    Defaults to “max-cost-diff-decr”.

  • top_count (int, optional) – the number of subpopulation groups that the algorithm will keep. Defaults to 1, i.e. returns the most biased group.

  • phi (float, optional) – effectiveness threshold. Real number in [0, 1]. Applicable for “equal-choice-for-recourse” and “equal-cost-of-effectiveness” metrics. For these two metrics, an action is considered to achieve recourse for a subpopulation group if at least a fraction phi of the group’s individuals achieve recourse (e.g. phi = 0.5 means at least half of them). Defaults to 0.5.

  • c (float, optional) – cost budget. Real number. Applicable for “equal-effectiveness-within-budget” metric. Specifies the maximum cost that can be paid for an action (by the individual, by a central authority etc.). Defaults to 0.5.

  • verbose (bool, optional) – whether to print intermediate messages and progress bar. Defaults to True.

  • print_recourse_report (bool, optional) – whether to print a detailed and annotated report of the most biased groups to stdout. If False, the most biased groups are only computed and returned. Defaults to False.

  • show_subgroup_costs (bool, optional) – Whether to show the costs assigned to each protected subgroup. Defaults to False.

  • show_action_costs (bool, optional) – Whether to show the costs assigned to each specific action. Defaults to False.

  • is_correctness_metric (bool, optional) – if True, the metric is considered to quantify utility, i.e. the greater it is for a group, the more beneficial it is for the individuals of the group. Defaults to False.

Returns:

the most biased groups as a list of pairs. In each pair, the first element is the group description as a dict. The second element is the value of the chosen unfairness metric for this group.

Return type:

list(tuple(dict(str, str), float))
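
To make the return shape concrete, the following sketch walks over a hand-written value of type list(tuple(dict(str, str), float)). The groups and scores are invented, and describe is a hypothetical helper, not part of the package.

```python
# Made-up result in the documented shape: (group description, metric value).
example_result = [
    ({"age": "30-40", "job": "clerk"}, 0.25),
    ({"education": "HS-grad"}, 0.10),
]


def describe(result):
    """Render each (group, score) pair as one human-readable line."""
    lines = []
    for group, score in result:
        condition = ", ".join(f"{feat} = {val}" for feat, val in group.items())
        lines.append(f"{condition}: unfairness {score:.2f}")
    return lines


for line in describe(example_result):
    print(line)
```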

humancompatible.explain.facts.__init__.print_recourse_report(rules: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], population_sizes: Dict[str, int] | None = None, missing_subgroup_val: str = 'N/A', subgroup_costs: Dict[Predicate, Dict[str, float]] | None = None, show_subgroup_costs: bool = False, show_then_costs: bool = False, show_cumulative_plots: bool = False, show_bias: str | None = None, correctness_metric: bool = False, metric_name: str | None = None) None[source]

Prints a report detailing the recourses and fairness assessment for a given set of rules. In our work, we generally refer to this representation as “Comparative Subgroup Counterfactuals”.

Parameters:
  • rules (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]) – The collection of rules with associated recourses, correctness, and costs.

  • population_sizes (Optional[Dict[str, int]], optional) – A dictionary specifying the population sizes for the protected and unprotected populations. If provided, it is added to the coverage statistics included in the report. Defaults to None.

  • missing_subgroup_val (str, optional) – In the case of records with missing values for the protected attribute (e.g. sex), this parameter determines whether the “group” of individuals with unknown protected attribute will be shown in the report or not. In other words, the group with this name is excluded altogether from the report. Defaults to “N/A”.

  • subgroup_costs (Optional[Dict[Predicate, Dict[str, float]]], optional) – A dictionary specifying the aggregate costs for each subgroup in each rule. If provided, the costs of recourses will be included in the report. Defaults to None.

  • show_subgroup_costs (bool, optional) – Indicates whether to display the subgroup costs in the report. Only applicable when subgroup_costs is provided. Defaults to False.

  • show_then_costs (bool, optional) – Indicates whether to display the counterfactual costs of recourses in the report. Defaults to False.

  • show_cumulative_plots (bool, optional) – Indicates whether to display the cumulative effectiveness plots in the report. Defaults to False.

  • show_bias (Optional[str], optional) – Specifies the biased subgroup to highlight in the report. Only applicable when subgroup_costs is provided. Defaults to None.

  • correctness_metric (bool, optional) – Indicates whether the metric to use for assessing bias is correctness or cost metric. If False, it is a cost metric and the group with maximum value is the biased one. If True, it is a correctness metric and the group with minimum value is the biased one. Only applicable when subgroup_costs is provided. Defaults to False.

  • metric_name (str, optional) – The name of the fairness metric used to assess bias. Only applicable when subgroup_costs is provided. Defaults to None.

Returns:

None

Rule Mining & Filtering

humancompatible.explain.facts.frequent_itemsets.fpgrowth_out_to_predicate_list(fpgrowth_out: DataFrame) Tuple[List[Predicate], List[float]][source]
Converts the output of the FP Growth algorithm stored in a DataFrame to a list of Predicate objects and their corresponding support values.

Parameters:

fpgrowth_out (DataFrame) – The DataFrame containing the output of the FP Growth algorithm.

Returns:

A tuple containing the list of Predicate objects and the list of corresponding support values.

Return type:

Tuple[List[Predicate], List[float]]

Raises:

None

Examples

>>> freq_itemsets = run_fpgrowth(preprocessDataset(df), min_support=0.03)
>>> predicate_list = fpgrowth_out_to_predicate_list(freq_itemsets)
humancompatible.explain.facts.frequent_itemsets.preprocessDataset(data: DataFrame) DataFrame[source]
Preprocesses the input DataFrame by converting categorical columns to NumPy arrays and mapping each cell value with its column name.

Parameters:

data (DataFrame) – The input DataFrame to be preprocessed.

Returns:

The preprocessed DataFrame.

Return type:

DataFrame

Raises:

None

humancompatible.explain.facts.frequent_itemsets.run_fpgrowth(data: DataFrame, min_support: float = 0.001) DataFrame[source]
Runs the FP Growth algorithm on the input DataFrame to find frequent itemsets.

Parameters:
  • data (DataFrame) – The input DataFrame.

  • min_support (float, optional) – The minimum support threshold for itemsets. Defaults to 0.001, i.e. 0.1%.

Returns:

The DataFrame containing frequent itemsets sorted by support in descending order.

Return type:

DataFrame

Raises:

None

Examples

>>> freq_itemsets = run_fpgrowth(preprocessDataset(df), min_support=0.03)
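
To clarify what “support” and min_support mean here, the sketch below mines frequent itemsets by brute force on four made-up rows. It is a stand-in for FP Growth, not the algorithm the package runs; representing an item as a (column, value) pair is an assumption made for this example, and frequent_itemsets is a hypothetical helper.

```python
from itertools import combinations

# Made-up dataset: one dict per row.
rows = [
    {"sex": "F", "job": "clerk"},
    {"sex": "F", "job": "clerk"},
    {"sex": "M", "job": "clerk"},
    {"sex": "M", "job": "nurse"},
]


def frequent_itemsets(rows, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Support is the fraction of rows that contain all items of the itemset.
    Enumerates all combinations, so it is only suitable for tiny inputs.
    """
    n = len(rows)
    transactions = [frozenset(r.items()) for r in rows]
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sum(itemset <= t for t in transactions) / n
            if support >= min_support:
                result[itemset] = support
    return result


for itemset, support in frequent_itemsets(rows, min_support=0.5).items():
    print(sorted(itemset), support)
```

With min_support=0.5, four itemsets survive: the three single items job = clerk, sex = F and sex = M, plus the pair sex = F and job = clerk, which appears in two of the four rows.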
class humancompatible.explain.facts.predicate.Predicate(features: List[str] = <factory>, values: List[Any] = <factory>)[source]

Bases: object

Represents a predicate with features and values.

contains(other: object) bool[source]

Checks if the predicate contains another predicate.

Parameters:

other – The predicate to check for containment.

Returns:

True if the predicate contains the other predicate, False otherwise.

features: List[str]
static from_dict(d: Dict[str, str]) Predicate[source]

Creates a Predicate instance from a dictionary.

Parameters:

d – A dictionary representing the predicate.

Returns:

A Predicate instance.

satisfies(x: Mapping[str, Any]) bool[source]

Checks if the predicate is satisfied by a given input.

Parameters:

x – The input to be checked against the predicate.

Returns:

True if the predicate is satisfied, False otherwise.

satisfies_v(X: DataFrame) Series[source]

Vectorized version of the satisfies method.

Parameters:

X (DataFrame) – a dataframe of instances (rows)

Returns:

boolean Series with value True if an instance satisfies the predicate and False otherwise

Return type:

pd.Series

to_dict() Dict[str, str][source]

Converts the predicate to a dictionary representation.

Returns:

A dictionary representing the predicate.

values: List[Any]
width()[source]

Returns the number of features in the predicate.
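
The following toy class mirrors part of the interface listed above, to show how a predicate over features and values can be read. ToyPredicate is a hypothetical stand-in written for this example; the equality-based meaning of satisfies is an assumption, and contains and satisfies_v are left out because their exact semantics are not specified here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Mapping


@dataclass
class ToyPredicate:
    """Hypothetical stand-in: a conjunction of feature == value tests."""

    features: List[str] = field(default_factory=list)
    values: List[Any] = field(default_factory=list)

    @staticmethod
    def from_dict(d: Dict[str, Any]) -> "ToyPredicate":
        return ToyPredicate(features=list(d.keys()), values=list(d.values()))

    def to_dict(self) -> Dict[str, Any]:
        return dict(zip(self.features, self.values))

    def satisfies(self, x: Mapping[str, Any]) -> bool:
        # Assumed semantics: every listed feature must equal its listed value.
        return all(x.get(f) == v for f, v in zip(self.features, self.values))

    def width(self) -> int:
        return len(self.features)


p = ToyPredicate.from_dict({"sex": "F", "job": "clerk"})
print(p.width())                                              # 2
print(p.satisfies({"sex": "F", "job": "clerk", "age": 41}))   # True
print(p.satisfies({"sex": "M", "job": "clerk"}))              # False
```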

humancompatible.explain.facts.predicate.drop_two_above(p1: Predicate, p2: Predicate, l: list) bool[source]

Checks if the values of the given predicates are within a difference of two based on the provided conditions.

Parameters:
  • p1 – The first Predicate.

  • p2 – The second Predicate.

  • l – The list of values for comparison.

Returns:

True if the values are within a difference of two, False otherwise.

humancompatible.explain.facts.predicate.featureChangePred(p1: Predicate, p2: Predicate, params: ParameterProxy = ParameterProxy())[source]

Calculates the feature change between two predicates.

Parameters:
  • p1 – The first Predicate.

  • p2 – The second Predicate.

  • params – The ParameterProxy object containing feature change functions.

Returns:

The feature change between the two predicates.

humancompatible.explain.facts.predicate.recIsValid(p1: Predicate, p2: Predicate, X: DataFrame, drop_infeasible: bool, feats_not_allowed_to_change: List[str] = []) bool[source]

Checks if the given pair of predicates is valid based on the provided conditions.

Parameters:
  • p1 – The first Predicate.

  • p2 – The second Predicate.

  • X – The DataFrame containing the data.

  • drop_infeasible – Flag indicating whether to drop infeasible cases.

Returns:

True if the pair of predicates is valid, False otherwise.

humancompatible.explain.facts.rule_filters.delete_fair_rules(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], subgroup_costs: Dict[Predicate, Dict[str, float]]) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]

Deletes fair rules from the given set of rules based on subgroup costs.

Parameters:
  • rulesbyif (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]) – Dictionary mapping predicates to a dictionary of group IDs and associated cost and correctness tuples.

  • subgroup_costs (Dict[Predicate, Dict[str, float]]) – Dictionary mapping predicates to a dictionary of group IDs and their corresponding subgroup costs.

Returns:

Dictionary containing the remaining rules after deleting the fair rules.

Return type:

Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]

humancompatible.explain.facts.rule_filters.filter_contained_rules_keep_max_bias(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], subgroup_costs: Dict[Predicate, Dict[str, float]]) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]
humancompatible.explain.facts.rule_filters.filter_contained_rules_simple(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]

Filters the rules to remove the contained rules based on simple containment criteria.

Parameters:

rulesbyif (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]) – Dictionary mapping predicates to a dictionary of group IDs and associated cost and correctness tuples.

Returns:

Filtered rules after removing the contained rules based on simple containment criteria.

Return type:

Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]

humancompatible.explain.facts.rule_filters.keep_only_minimum_change(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]

Filters rules based on minimum change.

Parameters:
  • rulesbyif (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]) – Dictionary mapping predicates to a dictionary of group IDs and associated cost and correctness tuples.

  • params (ParameterProxy, optional) – Parameter proxy object containing parameter values for calculating the feature change. Defaults to ParameterProxy().

Returns:

Dictionary containing the filtered rules based on the minimum change criterion.

Return type:

Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]

humancompatible.explain.facts.rule_filters.keep_rules_until_correctness_threshold_reached(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], threshold: float = 0.5) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]

Filters rules based on cumulative correctness threshold.

Parameters:
  • rulesbyif (Dict[ Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]] ]) – Dictionary mapping predicates to a dictionary of group IDs and associated cost, correctness, and cumulative cost tuples.

  • threshold (float, optional) – Threshold value for the cumulative correctness. Rules with a cumulative correctness value greater than the threshold are kept. Defaults to 0.5.

Returns:

Dictionary containing the filtered rules based on the cumulative correctness threshold.

Return type:

Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]

humancompatible.explain.facts.rule_filters.remove_rules_above_cost_budget(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], threshold: float = 0.5) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]

Filters rules based on cumulative cost threshold.

Parameters:
  • rulesbyif (Dict[ Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]] ]) – Dictionary mapping predicates to a dictionary of group IDs and associated cost, correctness, and cumulative cost tuples.

  • threshold (float, optional) – Threshold value for the cumulative cost. Rules with a cumulative cost value less than or equal to the threshold are kept. Defaults to 0.5.

Returns:

Dictionary containing the filtered rules based on the cumulative cost threshold.

Return type:

Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]

humancompatible.explain.facts.rule_filters.remove_rules_below_correctness_threshold(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], threshold: float = 0.5) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]

Filters the rules by correctness threshold.

Parameters:
  • rulesbyif (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]) – Dictionary mapping predicates to a dictionary of group IDs and associated cost and correctness tuples.

  • threshold (float, optional) – The threshold value for filtering the rules based on correctness. Defaults to 0.5.

Returns:

Filtered rules based on the correctness threshold.

Return type:

Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]
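
The threshold filters above all operate on the same nested structure. The sketch below reproduces that nesting with plain strings in place of Predicate objects and applies two simple filters. It is an illustration only: the helper names are hypothetical, the numbers are made up, the triple order (then-clause, correctness, cost) is assumed from the parameter descriptions in this reference, and keeping then-clauses exactly at the correctness threshold is an assumption.

```python
# {if-clause: {subgroup: (coverage, [(then-clause, correctness, cost), ...])}}
rules = {
    "job = clerk": {
        "Male": (0.30, [("job = manager", 0.8, 3.0), ("job = nurse", 0.4, 1.0)]),
        "Female": (0.25, [("job = manager", 0.6, 3.0), ("job = nurse", 0.2, 1.0)]),
    }
}


def filter_thens(rules, keep):
    """Rebuild the nested dict, keeping only then-clauses for which keep(t) is True."""
    return {
        ifclause: {
            subgroup: (coverage, [t for t in thens if keep(t)])
            for subgroup, (coverage, thens) in subgroups.items()
        }
        for ifclause, subgroups in rules.items()
    }


def remove_below_correctness(rules, threshold=0.5):
    return filter_thens(rules, lambda t: t[1] >= threshold)


def remove_above_cost(rules, threshold=0.5):
    # "Less than or equal to the threshold are kept", as documented above.
    return filter_thens(rules, lambda t: t[2] <= threshold)


print(remove_below_correctness(rules))
print(remove_above_cost(rules, threshold=1.0))
```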

Metrics

humancompatible.explain.facts.metrics.calculate_all_if_subgroup_costs(ifclauses: List[Predicate], all_thenclauses: List[Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], group_calculator: Callable[[Predicate, List[Tuple[Predicate, float, float]]], float]) Dict[Predicate, Dict[str, float]][source]
humancompatible.explain.facts.metrics.calculate_if_subgroup_costs(ifclause: Predicate, thenclauses: Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]], group_calculator: Callable[[Predicate, List[Tuple[Predicate, float, float]]], float]) Dict[str, float][source]

Calculate the costs for each subgroup of a given if-clause.

Parameters:
  • ifclause – The if-clause predicate.

  • thenclauses – A dictionary mapping subgroup names to their corresponding coverage and then-clause predicates.

  • group_calculator – The function used to calculate the cost for each subgroup. Defaults to if_group_cost_min_change_correctness_threshold.

  • **kwargs – Additional keyword arguments to be passed to the group_calculator function.

Returns:

A dictionary mapping subgroup names to their calculated costs.

humancompatible.explain.facts.metrics.if_group_average_recourse_cost_conditional(ifclause: Predicate, thens: List[Tuple[Predicate, float, float]]) float[source]

Calculate the average recourse cost conditional on the correctness for a given if-clause and a list of then-clauses.

Parameters:
  • ifclause – The if-clause predicate.

  • thens – The list of then-clause predicates with their corresponding correctness and cost values.

  • params – The parameter proxy.

Returns:

The average recourse cost conditional on the correctness.

humancompatible.explain.facts.metrics.if_group_cost_max_correctness_cost_budget(ifclause: Predicate, then_corrs_costs: List[Tuple[Predicate, float, float]], cost_thres: float = 0.5) float[source]

Calculate the maximum correctness value for a given if-clause and a list of then-clauses with cost below a threshold.

Parameters:
  • ifclause – The if-clause predicate.

  • then_corrs_costs – The list of then-clause predicates with their corresponding correctness and cost values.

  • cor_thres – The correctness threshold.

  • cost_thres – The cost threshold. Only then-clauses with cost below this threshold will be considered.

  • params – The parameter proxy.

Returns:

The maximum correctness value.
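
A budget-constrained maximum of this kind can be sketched in a few lines. The function below is a hypothetical re-implementation for illustration, not the package code: treating a cost equal to the budget as affordable and returning 0.0 when nothing is affordable are both assumptions.

```python
def max_correctness_within_budget(then_corrs_costs, cost_thres=0.5):
    """Best correctness among then-clauses whose cost fits the budget."""
    affordable = [corr for _then, corr, cost in then_corrs_costs if cost <= cost_thres]
    # Assumption: an empty selection yields 0.0 (no recourse achievable).
    return max(affordable, default=0.0)


# Made-up (then-clause, correctness, cost) triples.
thens = [("job = manager", 0.8, 3.0), ("job = nurse", 0.4, 1.0), ("hours = 40", 0.3, 0.5)]

print(max_correctness_within_budget(thens, cost_thres=1.0))  # 0.4
print(max_correctness_within_budget(thens, cost_thres=5.0))  # 0.8
```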

humancompatible.explain.facts.metrics.if_group_cost_min_change_correctness_threshold(ifclause: Predicate, thens_corrs_costs: List[Tuple[Predicate, float, float]], cor_thres: float = 0.5) float[source]

Calculate the minimum feature change for a given if-clause and a list of then-clauses with a minimum correctness threshold.

Parameters:
  • ifclause – The if-clause predicate.

  • thenclauses – The list of then-clause predicates with their corresponding correctness values.

  • cor_thres – The minimum correctness threshold. Only then-clauses with a correctness value greater than or equal to this threshold will be considered.

  • params – The parameter proxy.

Returns:

The minimum feature change value.

humancompatible.explain.facts.metrics.if_group_cost_recoursescount_correctness_threshold(ifclause: Predicate, thens_corrs_costs: List[Tuple[Predicate, float, float]], cor_thres: float = 0.5) float[source]

Calculate the negative count of feature changes for a given if-clause and a list of then-clauses with a minimum correctness threshold.

Parameters:
  • ifclause – The if-clause predicate.

  • thens_corrs_costs – The list of then-clause predicates with their corresponding correctness and cost values.

  • cor_thres – The minimum correctness threshold. Only then-clauses with a correctness value greater than or equal to this threshold will be considered.

Returns:

The negative count of feature changes.
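The two threshold-based metrics above can be sketched as follows. This is an illustration under assumptions rather than the library's code: strings stand in for Predicate objects, an empty selection is reported as infinite cost, and the "count" is read as the number of qualifying then-clauses, based on the function name.

```python
import math
from typing import List, Tuple

ThenTriple = Tuple[str, float, float]  # (then-clause label, correctness, cost)


def min_cost_above_correctness(thens: List[ThenTriple], cor_thres: float = 0.5) -> float:
    """Cheapest then-clause among those with correctness >= cor_thres."""
    # Assumption: no qualifying then-clause is reported as infinite cost.
    costs = [cost for _, corr, cost in thens if corr >= cor_thres]
    return min(costs, default=math.inf)


def negative_recourse_count(thens: List[ThenTriple], cor_thres: float = 0.5) -> float:
    """Negated number of then-clauses with correctness >= cor_thres."""
    # Negating the count makes "more available recourses" read as a lower cost.
    return -float(sum(1 for _, corr, _ in thens if corr >= cor_thres))


thens = [("raise income", 0.8, 2.0), ("change job", 0.4, 0.3), ("add savings", 0.6, 1.0)]
cheapest = min_cost_above_correctness(thens, cor_thres=0.5)
neg_count = negative_recourse_count(thens, cor_thres=0.5)
```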

humancompatible.explain.facts.metrics.if_group_maximum_correctness(ifclause: Predicate, thens_corrs_costs: List[Tuple[Predicate, float, float]]) float[source]

Calculate the maximum correctness value for a given if-clause and a list of then-clauses.

Parameters:
  • ifclause – The if-clause predicate.

  • thens_corrs_costs – The list of then-clause predicates with their corresponding correctness and cost values.

Returns:

The maximum correctness value.

humancompatible.explain.facts.metrics.incorrectRecoursesIfThen(ifclause: Predicate, thenclause: Predicate, X_aff: DataFrame, model) int[source]

Compute the number of incorrect recourses given an if-then clause.

Parameters:
  • ifclause – The if-clause predicate.

  • thenclause – The then-clause predicate.

  • X_aff – The affected DataFrame.

  • model – The ML model under study. Expected to have a “predict” method.

Returns:

The number of incorrect recourses.

Raises:

ValueError – If there are no covered instances for the given if-clause.
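A hedged sketch of the counting logic, assuming dictionaries of feature values in place of Predicate objects and assuming that a recourse is incorrect when the model still predicts 0 after the change. `ThresholdModel` is a hypothetical stand-in for a fitted classifier.

```python
import pandas as pd


class ThresholdModel:
    """Toy classifier: predicts 1 when income >= 50 and hours >= 30."""

    def predict(self, X: pd.DataFrame):
        return ((X["income"] >= 50) & (X["hours"] >= 30)).astype(int).to_numpy()


def incorrect_recourses(if_clause: dict, then_clause: dict, X_aff: pd.DataFrame, model) -> int:
    # Select the rows covered by the if-clause, one feature at a time.
    mask = pd.Series(True, index=X_aff.index)
    for feature, value in if_clause.items():
        mask &= X_aff[feature] == value
    covered = X_aff[mask]
    if covered.empty:
        raise ValueError("no covered instances for the given if-clause")
    # Apply the then-clause by overwriting the listed features.
    changed = covered.assign(**then_clause)
    # Assumption: a prediction of 0 after the change means the recourse failed.
    return int((model.predict(changed) == 0).sum())


X_aff = pd.DataFrame(
    {"job": ["clerk", "clerk", "nurse"], "income": [30, 40, 35], "hours": [40, 20, 40]}
)
n_incorrect = incorrect_recourses({"job": "clerk"}, {"income": 60}, X_aff, ThresholdModel())
```

Raising income helps the first clerk but not the second, whose hours remain too low, so one recourse is counted as incorrect.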

humancompatible.explain.facts.metrics.max_intergroup_cost_diff(ifclause: Predicate, thenclauses: Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]], group_calculator: Callable[[Predicate, List[Tuple[Predicate, float, float]]], float]) float[source]

Calculate the maximum difference in subgroup costs for an if-clause and its corresponding then-clauses.

Parameters:
  • ifclause – The if-clause predicate.

  • thenclauses – A dictionary mapping subgroup names to their corresponding coverage, then-clause predicates, and costs.

  • group_calculator – A callable that takes the if-clause and one subgroup's list of (then-clause, correctness, cost) triples and returns that subgroup's cost.

Returns:

The maximum difference in subgroup costs.
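As a rough illustration, the largest pairwise difference between subgroup costs equals the maximum cost minus the minimum cost. The sketch below assumes string labels instead of Predicate objects, and `cheapest_action` is a hypothetical group calculator chosen only for the example.

```python
from typing import Callable, Dict, List, Tuple

ThenTriple = Tuple[str, float, float]        # (then-clause label, correctness, cost)
GroupEntry = Tuple[float, List[ThenTriple]]  # (coverage, then-clauses)


def max_cost_diff(
    ifclause: str,
    thenclauses: Dict[str, GroupEntry],
    group_calculator: Callable[[str, List[ThenTriple]], float],
) -> float:
    # One cost per subgroup; coverage is not needed for this step.
    costs = [group_calculator(ifclause, thens) for _, thens in thenclauses.values()]
    return max(costs) - min(costs)


def cheapest_action(ifclause: str, thens: List[ThenTriple]) -> float:
    """Hypothetical calculator: cost of the cheapest then-clause."""
    return min(cost for _, _, cost in thens)


by_group = {
    "Male": (0.20, [("t1", 0.7, 1.0), ("t2", 0.9, 3.0)]),
    "Female": (0.25, [("t1", 0.6, 2.5)]),
}
diff = max_cost_diff("job=clerk", by_group, cheapest_action)
```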

Optimization & Utilities

humancompatible.explain.facts.optimization.sort_triples_KStest(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], affected_population_sizes: Dict[str, int]) Tuple[List[Tuple[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]], Dict[Predicate, float]][source]

Sorts the triples using the Kolmogorov-Smirnov test to measure unfairness.

Parameters:
  • rulesbyif (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]) – Dictionary mapping predicates to a dictionary of group IDs and associated cost, correctness, and predicate tuples.

  • affected_population_sizes (Dict[str, int]) – Dictionary mapping group IDs to their respective affected population sizes.

Returns:

A tuple containing a sorted list of triples and a dictionary mapping predicates to their unfairness scores.

Return type:

Tuple[List[Tuple[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]], Dict[Predicate, float]]
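For orientation, the two-sample Kolmogorov-Smirnov statistic is the largest absolute gap between two empirical distribution functions. The sketch below computes it for hypothetical per-group samples of recourse costs and ranks if-clauses by it. How the library derives the compared distributions from rulesbyif and affected_population_sizes is not documented here, so the inputs are assumptions.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest absolute gap between the empirical CDFs of two samples."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    # The gap can only change at observed values, so checking those suffices.
    points = sorted(set(a_sorted) | set(b_sorted))
    return max(
        abs(bisect_right(a_sorted, x) / len(a_sorted) - bisect_right(b_sorted, x) / len(b_sorted))
        for x in points
    )


# Hypothetical cost samples for two protected groups under each if-clause.
costs_by_if = {
    "city=A": ([1, 2, 3, 4], [1, 2, 3, 4]),
    "job=clerk": ([1, 1, 2, 3], [2, 3, 3, 4]),
}
unfairness = {ifc: ks_statistic(g0, g1) for ifc, (g0, g1) in costs_by_if.items()}
ranked = sorted(unfairness.items(), key=lambda kv: kv[1], reverse=True)
```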

humancompatible.explain.facts.optimization.sort_triples_by_max_costdiff(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], ignore_nans: bool = False, ignore_infs: bool = False, secondary_objectives: List[str] = [], **kwargs) List[Tuple[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]][source]

Sorts the triples by maximum cost difference with generic options to handle NaN, infinity, and secondary objectives.

Parameters:
  • rulesbyif (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]) – Dictionary mapping predicates to a dictionary of group IDs and associated cost, correctness, and predicate tuples.

  • ignore_nans (bool, optional) – Flag indicating whether to ignore NaN values in the cost difference. Defaults to False.

  • ignore_infs (bool, optional) – Flag indicating whether to ignore infinity values in the cost difference. Defaults to False.

  • secondary_objectives (List[str], optional) – List of secondary objectives to include in the sorting criteria. Defaults to an empty list.

Returns:

Sorted list of triples with the associated maximum cost difference.

Return type:

List[Tuple[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]]]
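A small sketch of sorting by a per-rule score in decreasing order while handling NaN and infinite values. It assumes that "ignoring" means dropping the affected entries and that the scores have already been computed; neither is confirmed by the documentation above.

```python
import math
from typing import Dict, List, Tuple


def sort_by_score(
    scores: Dict[str, float], ignore_nans: bool = False, ignore_infs: bool = False
) -> List[Tuple[str, float]]:
    items = list(scores.items())
    if ignore_nans:
        items = [(k, v) for k, v in items if not math.isnan(v)]
    if ignore_infs:
        items = [(k, v) for k, v in items if not math.isinf(v)]
    # NaN does not order against other floats, so place it last explicitly.
    return sorted(items, key=lambda kv: (math.isnan(kv[1]), 0.0 if math.isnan(kv[1]) else -kv[1]))


scores = {"age=30": 1.5, "job=clerk": math.inf, "city=A": math.nan, "edu=HS": 0.5}
all_sorted = sort_by_score(scores)
finite_only = sort_by_score(scores, ignore_nans=True, ignore_infs=True)
```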

humancompatible.explain.facts.utils.load_model(file: PathLike)[source]
Loads and returns a trained model from the specified file using the pickle library.

Parameters:

file (PathLike) – The path to the file containing the model.

Returns:

The loaded trained model.

Return type:

ModelAPI

Raises:

None

humancompatible.explain.facts.utils.load_object(file: PathLike) object[source]
Loads and returns an object from the specified file using the pickle library.

Parameters:

file (PathLike) – The path to the file containing the object.

Returns:

The loaded object.

Return type:

object

Raises:

None

humancompatible.explain.facts.utils.load_rules_by_if(file: PathLike) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]][source]

Loads and returns a dictionary of rules.

Parameters:

file (PathLike) – The path to the file containing the rules.

Returns:

The dictionary of rules organized by the antecedent Predicate.

Return type:

Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]

Raises:

None

humancompatible.explain.facts.utils.load_state(file: PathLike) Tuple[Dict, DataFrame, Any][source]
Loads and returns the rules, DataFrame, and model from the specified file using the pickle library.

Parameters:

file (PathLike) – The path to the file containing the state.

Returns:

A tuple containing the loaded rules, DataFrame, and model.

Return type:

Tuple[Dict, DataFrame, ModelAPI]

Raises:

None

humancompatible.explain.facts.utils.load_test_data_used(file: PathLike) DataFrame[source]
Loads and returns the test data used from the specified file using the pickle library.

Parameters:

file (PathLike) – The path to the file containing the test data.

Returns:

The loaded test data.

Return type:

DataFrame

Raises:

None

humancompatible.explain.facts.utils.save_model(file: PathLike, model) None[source]
Saves the provided model to the specified file using the pickle library.

Parameters:
  • file (PathLike) – The path to the file where the model will be saved.

  • model (ModelAPI) – The model to be saved.

Raises:

None

humancompatible.explain.facts.utils.save_object(file: PathLike, o: object) None[source]

Saves the provided object to the specified file using the pickle library.

Parameters:
  • file (PathLike) – The path to the file where the object will be saved.

  • o (object) – The object to be saved.

Returns:

None

Raises:

None
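The save and load helpers in this module are described as thin wrappers around pickle. A minimal round trip using only the standard library might look like the following; `save_pickle` and `load_pickle` are local stand-ins written for this example, not the library functions.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickle(file, o) -> None:
    # Binary mode is required for pickle streams.
    with open(file, "wb") as f:
        pickle.dump(o, f)


def load_pickle(file):
    with open(file, "rb") as f:
        return pickle.load(f)


rules = {"job=clerk": {"Male": (0.3, [("job=manager", 0.7)])}}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "rules.pkl"
    save_pickle(path, rules)
    restored = load_pickle(path)
```

Unpickling can execute arbitrary code, so files should only be loaded from trusted sources.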

humancompatible.explain.facts.utils.save_rules_by_if(file: PathLike, rules: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]) None[source]
Saves the provided rules dictionary to the specified file using the pickle library.

Parameters:
  • file (PathLike) – The path to the file where the rules will be saved.

  • rules (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]) – The dictionary of rules to be saved.

Raises:

None

humancompatible.explain.facts.utils.save_state(file: PathLike, rules: Dict, X: DataFrame, model) None[source]
Saves the rules, DataFrame, and model to the specified file using the pickle library.

Parameters:
  • file (PathLike) – The path to the file where the data will be saved.

  • rules (Dict) – The rules dictionary to be saved.

  • X (DataFrame) – The DataFrame to be saved.

  • model (ModelAPI) – The model to be saved.

Raises:

None

humancompatible.explain.facts.utils.save_test_data_used(file: PathLike, X: DataFrame) None[source]
Saves the provided test data to the specified file using the pickle library.

Parameters:
  • file (PathLike) – The path to the file where the test data will be saved.

  • X (DataFrame) – The test data to be saved.

Raises:

None

humancompatible.explain.facts.misc.aff_intersection_version_2(RLs_and_supports, subgroups, verbose=True)[source]

Compute the intersection of multiple sets of predicates and their corresponding supports.

Parameters:
  • RLs_and_supports (Dict[str, List[Tuple[Dict[str, any], float]]]) – Dictionary of predicates and their supports for each subgroup.

  • subgroups (List[str]) – List of subgroup names.

  • verbose (bool) – whether to print progress bar. Defaults to True.

Returns:

List of tuples containing the intersected predicates and their supports.

Return type:

List[Tuple[Predicate, Dict[str, float]]]

Raises:

ValueError – If there are fewer than 2 subgroups.
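One way to picture the intersection step: keep only the itemsets that appear for every subgroup, and record each subgroup's support for them. The sketch assumes itemsets are plain dictionaries with hashable values and is not the library's algorithm.

```python
from typing import Any, Dict, List, Tuple

Itemset = Dict[str, Any]


def intersect_itemsets(
    itemsets_by_group: Dict[str, List[Tuple[Itemset, float]]], subgroups: List[str]
) -> List[Tuple[Itemset, Dict[str, float]]]:
    if len(subgroups) < 2:
        raise ValueError("at least 2 subgroups are required")
    # Index each subgroup's itemsets by a hashable key.
    indexed = {
        g: {frozenset(items.items()): supp for items, supp in itemsets_by_group[g]}
        for g in subgroups
    }
    common = set.intersection(*(set(d) for d in indexed.values()))
    return [
        (dict(key), {g: indexed[g][key] for g in subgroups})
        for key in sorted(common, key=sorted)
    ]


by_group = {
    "Male": [({"job": "clerk"}, 0.3), ({"city": "A"}, 0.2)],
    "Female": [({"job": "clerk"}, 0.4), ({"edu": "HS"}, 0.25)],
}
shared = intersect_itemsets(by_group, ["Male", "Female"])
```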

humancompatible.explain.facts.misc.affected_unaffected_split(X: DataFrame, model) Tuple[DataFrame, DataFrame][source]

Split the input data into affected and unaffected individuals.

Parameters:
  • X (pd.DataFrame) – The input data.

  • model (ModelAPI) – The model used for predictions.

Returns:

A tuple containing the affected individuals and unaffected individuals.

Return type:

Tuple[pd.DataFrame, pd.DataFrame]
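A minimal sketch of such a split, assuming that "affected" means the model predicts the unfavorable class 0 and "unaffected" means it predicts 1. `IncomeModel` is a hypothetical stand-in for a fitted classifier.

```python
import numpy as np
import pandas as pd


class IncomeModel:
    """Toy classifier: predicts 1 when income >= 50."""

    def predict(self, X: pd.DataFrame):
        return (X["income"] >= 50).astype(int).to_numpy()


def split_affected(X: pd.DataFrame, model):
    preds = np.asarray(model.predict(X))
    # Assumption: prediction 0 is the unfavorable outcome.
    return X[preds == 0], X[preds == 1]


X = pd.DataFrame({"income": [30, 60, 45, 80], "sex": ["F", "M", "M", "F"]})
affected, unaffected = split_affected(X, IncomeModel())
```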

humancompatible.explain.facts.misc.calc_costs(rules: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]], params: ParameterProxy = ParameterProxy(featureChanges=defaultdict(<function make_default_featureChanges.<locals>.<lambda>>, {}))) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]
humancompatible.explain.facts.misc.calculate_correctnesses(ifthens_withsupp: List[Tuple[Predicate, Predicate, Dict[str, float]]], affected_by_subgroup: Dict[str, DataFrame], sensitive_attribute: str, model, verbose: bool = True) List[Tuple[Predicate, Predicate, Dict[str, float], Dict[str, float]]][source]

Calculate the correctness of recourse actions for each subgroup in a list of if-then rules.

Parameters:
  • ifthens_withsupp (List[Tuple[Predicate, Predicate, Dict[str, float]]]) – List of if-then rules with their support values.

  • affected_by_subgroup (Dict[str, DataFrame]) – Dictionary where keys are subgroup names and values are DataFrames representing affected individuals in each subgroup.

  • sensitive_attribute (str) – Name of the sensitive attribute in the dataset.

  • model (ModelAPI) – The model used for making predictions.

  • verbose (bool) – whether to print progress bar. Defaults to True.

Returns:

List of tuples containing the if-then rule, its support values, and the correctness of recourse actions for each subgroup.

Return type:

List[Tuple[Predicate, Predicate, Dict[str, float], Dict[str, float]]]

humancompatible.explain.facts.misc.cum_corr_costs(ifclause: Predicate, thenclauses: List[Tuple[Predicate, float]], X: DataFrame, model, params: ParameterProxy = ParameterProxy(featureChanges=defaultdict(<function make_default_featureChanges.<locals>.<lambda>>, {}))) List[Tuple[Predicate, float, float]][source]

Calculate cumulative correctness and costs for the given if-then rules.

Parameters:
  • ifclause – The if-clause predicate.

  • thenclauses – A list of tuples containing then-clause predicates and their corresponding correctness values.

  • X – The DataFrame containing the data.

  • model – The model API used for prediction.

  • params – Optional parameter proxy (default: ParameterProxy()).

Returns:

A list of tuples containing the updated then-clause predicates, cumulative correctness values, and costs.
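One plausible reading of "cumulative correctness" is the share of covered individuals helped by at least one action among those considered so far, with actions taken in order of increasing cost. The sketch below follows that reading. The input format (a set of flipped individual ids per action) is an assumption made to keep the example free of any model.

```python
from typing import List, Set, Tuple


def cumulative_correctness(
    actions: List[Tuple[str, float, Set[int]]], n_covered: int
) -> List[Tuple[str, float, float]]:
    """actions holds (label, cost, ids of covered individuals whose prediction flips)."""
    flipped: Set[int] = set()
    result = []
    # Walk through the actions from cheapest to most expensive.
    for label, cost, ids in sorted(actions, key=lambda a: a[1]):
        flipped |= ids
        result.append((label, len(flipped) / n_covered, cost))
    return result


actions = [
    ("big change", 3.0, {0, 1, 2}),
    ("small change", 1.0, {0, 1}),
    ("medium change", 2.0, {1, 3}),
]
curve = cumulative_correctness(actions, n_covered=4)
```

Each output triple is ordered as (then-clause, cumulative correctness, cost), matching the return type documented above.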

humancompatible.explain.facts.misc.cum_corr_costs_all(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]], X: DataFrame, model, sensitive_attribute: str, params: ParameterProxy = ParameterProxy(featureChanges=defaultdict(<function make_default_featureChanges.<locals>.<lambda>>, {})), verbose: bool = True) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]][source]

Calculate cumulative correctness and costs for all if-then rules.

Parameters:
  • rulesbyif – A dictionary containing if-clause predicates as keys and a nested dictionary as values. The nested dictionary contains subgroup names as keys, and tuples of coverage and a list of then-clause predicates with their corresponding correctness values as values.

  • X – The DataFrame containing the data.

  • model – The model API used for prediction.

  • sensitive_attribute – The name of the sensitive attribute in the data.

  • params – Optional parameter proxy (default: ParameterProxy()).

  • verbose – whether to print intermediate messages and progress bar. Defaults to True.

Returns:

A dictionary with if-clause predicates as keys. Each if-clause predicate maps to a nested dictionary where subgroup names are the keys, and tuples of coverage and a list of updated then-clause predicates with their cumulative correctness values and costs are the values.

humancompatible.explain.facts.misc.cum_corr_costs_all_minimal(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]], X: DataFrame, model, sensitive_attribute: str, params: ParameterProxy = ParameterProxy(featureChanges=defaultdict(<function make_default_featureChanges.<locals>.<lambda>>, {}))) Dict[Predicate, Dict[str, List[Tuple[float, float]]]][source]

Compute cumulative correctness and cost for all rules.

Parameters:
  • rulesbyif – A dictionary containing if-clause predicates as keys and a nested dictionary as values. The nested dictionary contains subgroup names as keys, and tuples of coverage and a list of then-clause predicates with their corresponding correctness as values.

  • X – The input DataFrame.

  • model – The model API.

  • sensitive_attribute – The name of the sensitive attribute in the DataFrame.

  • params – Optional parameter proxy (default: ParameterProxy()).

Returns:

A dictionary containing if-clause predicates as keys and a nested dictionary as values. The nested dictionary contains subgroup names as keys, and a list of tuples representing the cumulative correctness and cost of the then-clauses.

humancompatible.explain.facts.misc.freqitemsets_with_supports(X: DataFrame, min_support: float = 0.01) Tuple[List[Predicate], List[float]][source]

Calculate frequent itemsets with their support values.

Parameters:
  • X (DataFrame) – The input data.

  • min_support (float, optional) – The minimum support threshold. Defaults to 0.01.

Returns:

A tuple containing the list of frequent itemsets and their support values.

Return type:

Tuple[List[Predicate], List[float]]
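To make "support" concrete: the support of an itemset is the fraction of rows that contain all of its feature-value pairs. The sketch below uses naive enumeration instead of FP Growth, which is only practical for a handful of columns, and represents itemsets as dictionaries instead of Predicate objects.

```python
from collections import Counter
from itertools import combinations
from typing import Any, Dict, List, Tuple


def frequent_itemsets(
    rows: List[Dict[str, Any]], min_support: float = 0.01
) -> Tuple[List[Dict[str, Any]], List[float]]:
    counts: Counter = Counter()
    for row in rows:
        items = sorted(row.items())  # keys are unique, so values are never compared
        # Count every non-empty subset of this row's feature-value pairs.
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    n = len(rows)
    kept = [(dict(combo), c / n) for combo, c in counts.items() if c / n >= min_support]
    return [items for items, _ in kept], [supp for _, supp in kept]


rows = [
    {"job": "clerk", "city": "A"},
    {"job": "clerk", "city": "B"},
    {"job": "nurse", "city": "A"},
    {"job": "clerk", "city": "A"},
]
itemsets, supports = frequent_itemsets(rows, min_support=0.5)
```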

humancompatible.explain.facts.misc.rules2rulesbyif(rules: List[Tuple[Predicate, Predicate, Dict[str, float], Dict[str, float]]]) Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]][source]

Group rules based on the If clauses instead of protected subgroups.

Parameters:

rules (List[Tuple[Predicate, Predicate, Dict[str, float], Dict[str, float]]]) – List of tuples containing the if-then rules, coverage metrics, and correctness metrics.

Returns:

Dictionary containing the rules grouped by the If clauses.

Return type:

Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]

humancompatible.explain.facts.misc.rulesbyif2rules(rules_by_if: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]) List[Tuple[Predicate, Predicate, Dict[str, float], Dict[str, float]]][source]

Convert rules grouped by If clauses to rules.

Parameters:

rules_by_if (Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float]]]]]) – Dictionary containing rules grouped by the If clauses.

Returns:

List of tuples containing the if-then rules, coverage metrics, and correctness metrics.

Return type:

List[Tuple[Predicate, Predicate, Dict[str, float], Dict[str, float]]]
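The two layouts documented above carry the same information, so converting back and forth should be lossless. The following sketch mirrors the documented type shapes with strings in place of Predicate objects. It assumes that coverage depends only on the if-clause and the subgroup, and it is not the library's code.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str, Dict[str, float], Dict[str, float]]  # (if, then, coverage, correctness)
RulesByIf = Dict[str, Dict[str, Tuple[float, List[Tuple[str, float]]]]]


def group_by_if(rules: List[Rule]) -> RulesByIf:
    grouped: RulesByIf = {}
    for ifc, thenc, cov, cor in rules:
        groups = grouped.setdefault(ifc, {})
        for g in cov:
            # Coverage is stored once per (if-clause, subgroup).
            _, thens = groups.setdefault(g, (cov[g], []))
            thens.append((thenc, cor[g]))
    return grouped


def ungroup(rules_by_if: RulesByIf) -> List[Rule]:
    merged: Dict[Tuple[str, str], Tuple[Dict[str, float], Dict[str, float]]] = {}
    for ifc, groups in rules_by_if.items():
        for g, (coverage, thens) in groups.items():
            for thenc, corr in thens:
                cov, cor = merged.setdefault((ifc, thenc), ({}, {}))
                cov[g] = coverage
                cor[g] = corr
    return [(ifc, thenc, cov, cor) for (ifc, thenc), (cov, cor) in merged.items()]


rules = [
    ("job=clerk", "job=manager", {"Male": 0.3, "Female": 0.4}, {"Male": 0.7, "Female": 0.5}),
    ("job=clerk", "job=nurse", {"Male": 0.3, "Female": 0.4}, {"Male": 0.2, "Female": 0.6}),
]
grouped = group_by_if(rules)
round_trip = ungroup(grouped)
```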

humancompatible.explain.facts.misc.select_rules_subset(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], metric: str = 'equal-effectiveness', sort_strategy: str = 'max-cost-diff-decr', top_count: int = 10, filter_sequence: List[str] = [], cor_threshold: float = 0.5, cost_threshold: float = 0.5, secondary_sorting_objectives: List[str] = []) Tuple[Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], Dict[Predicate, Dict[str, float]]][source]

Selects a subset of rules.

Parameters:
  • rulesbyif – A dictionary mapping predicates to a dictionary of tuples containing cost, correctness, and cumulative cost values for each rule.

  • metric – The metric to use for sorting the rules (default: “equal-effectiveness”).

  • sort_strategy – The strategy to use for sorting the rules (default: “max-cost-diff-decr”).

  • top_count – The number of top rules to select (default: 10).

  • filter_sequence – A list of filtering criteria to apply to the rules (default: []).

  • cor_threshold – The correctness threshold for filtering rules (default: 0.5).

  • cost_threshold – The cost threshold for filtering rules (default: 0.5).

  • secondary_sorting_objectives – A list of secondary objectives for sorting the rules (default: []).

Returns:

A tuple containing the selected subset of rules and the costs of the then-blocks of the top rules.

humancompatible.explain.facts.misc.select_rules_subset_KStest(rulesbyif: Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], affected_population_sizes: Dict[str, int], top_count: int = 10, filter_contained: bool = False) Tuple[Dict[Predicate, Dict[str, Tuple[float, List[Tuple[Predicate, float, float]]]]], Dict[Predicate, float]][source]

Selects a subset of rules based on the Kolmogorov-Smirnov (KS) test metric.

Parameters:
  • rulesbyif – A dictionary mapping predicates to a dictionary of tuples containing cost, correctness, and cumulative cost values for each rule.

  • affected_population_sizes – A dictionary mapping subgroup names to their sizes.

  • top_count – The number of top rules to select (default: 10).

  • filter_contained – Whether to filter contained rules (default: False).

Returns:

A tuple containing the selected subset of rules and the unfairness values calculated.

humancompatible.explain.facts.misc.valid_ifthens(X: DataFrame, model, sensitive_attribute: str, freqitem_minsupp: float = 0.01, missing_subgroup_val: str = 'N/A', drop_infeasible: bool = True, feats_not_allowed_to_change: List[str] = [], verbose: bool = True) List[Tuple[Predicate, Predicate, Dict[str, float], Dict[str, float]]][source]

Compute valid if-then rules along with their coverage and correctness metrics.

Parameters:
  • X (DataFrame) – Input data.

  • model (ModelAPI) – The model used for predictions.

  • sensitive_attribute (str) – The name of the sensitive attribute column in the dataset.

  • freqitem_minsupp (float) – Minimum support threshold for frequent itemset mining.

  • missing_subgroup_val (str) – Value indicating missing or unknown subgroup.

  • drop_infeasible (bool) – Whether to drop infeasible if-then rules.

  • feats_not_allowed_to_change (list[str]) – optionally, the user can provide some features which are not allowed to change at all (e.g. sex).

  • verbose (bool) – whether to print intermediate messages and progress bar. Defaults to True.

Returns:

List of tuples containing the valid if-then rules, coverage metrics, and correctness metrics.

Return type:

List[Tuple[Predicate, Predicate, Dict[str, float], Dict[str, float]]]

class humancompatible.explain.facts.parameters.ParameterProxy(featureChanges: Dict[str, Callable[[Any, Any], float]] = <factory>)[source]

Bases: object

Proxy class for managing recourse parameters.

featureChanges: Dict[str, Callable[[Any, Any], float]]
setFeatureChange(fc: Dict)[source]

Set the feature changes.

Parameters:

fc (Dict) – A dictionary mapping feature names to their change functions.

humancompatible.explain.facts.parameters.default_change(v1, v2) float[source]

Compares two values and returns 0 if they are equal, and 1 if they are different.

Parameters:
  • v1 – The first value to be compared.

  • v2 – The second value to be compared.

Returns:

0 if the values are equal, 1 if the values are different.

Return type:

int

humancompatible.explain.facts.parameters.feature_change_builder(X: DataFrame | None, num_cols: List[str], cate_cols: List[str], ord_cols: List[str], feature_weights: Dict[str, int], num_normalization: bool = False, feats_to_normalize: List[str] | None = None) Dict[str, Callable[[Any, Any], float]][source]

Constructs a dictionary of feature change functions based on the input parameters.

Parameters:
  • X (DataFrame) – The input DataFrame containing the data.

  • num_cols (List[str]) – A list of column names representing the numeric features.

  • cate_cols (List[str]) – A list of column names representing the categorical features.

  • ord_cols (List[str]) – A list of column names representing the ordinal features.

  • feature_weights (Dict[str, int]) – A dictionary mapping feature names to their corresponding weights.

  • num_normalization (bool, optional) – A flag indicating whether to normalize numeric features. Default is False.

  • feats_to_normalize (Optional[List[str]], optional) – A list of column names specifying the numeric features to be normalized. If None, all numeric features will be normalized. Default is None.

Returns:

A dictionary mapping feature names to the corresponding feature change functions.

Return type:

Dict[str, Callable[[Any, Any], float]]

humancompatible.explain.facts.parameters.make_default_featureChanges()[source]

Creates a defaultdict with a default value of the default_change function.

Returns:

A defaultdict with default_change as the default value.

Return type:

defaultdict
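The default behavior described above can be reproduced with the standard library alone. This sketch re-implements both helpers locally for illustration; the names mirror the documented ones, but the code is not copied from the library.

```python
from collections import defaultdict


def default_change(v1, v2) -> int:
    """0 when the two values are equal, 1 otherwise."""
    return 0 if v1 == v2 else 1


def make_default_feature_changes() -> defaultdict:
    # Any feature without an explicit entry falls back to default_change.
    return defaultdict(lambda: default_change)


changes = make_default_feature_changes()
# One changed feature (age) and one unchanged feature (job).
total_cost = changes["age"](30, 35) + changes["job"]("clerk", "clerk")
```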

humancompatible.explain.facts.parameters.naive_feature_change_builder(num_cols: List[str], cate_cols: List[str], feature_weights: Dict[str, int]) Dict[str, Callable[[Any, Any], int]][source]

Builds a dictionary of feature change functions based on the provided lists of numerical and categorical columns, along with the weights for each feature.

Parameters:
  • num_cols (List[str]) – List of numerical column names.

  • cate_cols (List[str]) – List of categorical column names.

  • feature_weights (Dict[str, int]) – Dictionary mapping feature names to their weights.

Returns:

Dictionary of feature change functions.

Return type:

Dict[str, Callable[[Any, Any], int]]
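A sketch of what such a builder could look like, under explicit assumptions: numeric features cost the weighted absolute difference, categorical features cost the weight when the value changes and 0 otherwise, and a missing weight defaults to 1. The library's actual cost functions may differ.

```python
from typing import Any, Callable, Dict, List


def naive_builder(
    num_cols: List[str], cate_cols: List[str], feature_weights: Dict[str, int]
) -> Dict[str, Callable[[Any, Any], int]]:
    # Factory functions bind the weight per feature and avoid late-binding closures.
    def numeric(weight: int) -> Callable[[Any, Any], int]:
        return lambda v1, v2: weight * abs(v1 - v2)

    def categorical(weight: int) -> Callable[[Any, Any], int]:
        return lambda v1, v2: weight * (0 if v1 == v2 else 1)

    changes: Dict[str, Callable[[Any, Any], int]] = {}
    for col in num_cols:
        changes[col] = numeric(feature_weights.get(col, 1))
    for col in cate_cols:
        changes[col] = categorical(feature_weights.get(col, 1))
    return changes


changes = naive_builder(["age"], ["job"], {"age": 2})
age_cost = changes["age"](30, 35)
job_cost = changes["job"]("clerk", "nurse")
```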