glance.utils package
Submodules
humancompatible.explain.glance.utils.action module
- humancompatible.explain.glance.utils.action.actions_mean_pandas(actions: DataFrame, numerical_features: List[str], categorical_features: List[str], categorical_no_action_token: Any) → Series
Computes the mean action for numerical features and the most frequent action for categorical features from a given actions DataFrame.
For numerical features, the function calculates the mean of the actions across all instances. For categorical features, it determines the most frequent value in the actions DataFrame, unless all values are equal to the categorical_no_action_token, in which case the token is returned.
Parameters:
- actions : pd.DataFrame
A DataFrame where each row represents an instance, and each column represents an action for a feature (either numerical or categorical).
- numerical_features : List[str]
List of columns in actions that are numerical features.
- categorical_features : List[str]
List of columns in actions that are categorical features.
- categorical_no_action_token : Any
A token or value that indicates no action is needed for categorical features.
Returns:
- pd.Series
A Series with one entry per feature. For numerical features, each value is the mean of the actions in that column. For categorical features, each value is the most frequent action in that column, or the categorical_no_action_token if every action in the column equals the token.
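The aggregation described above can be sketched as follows. This is a minimal illustration of the documented behavior, not the library's implementation: the name mean_action_sketch is hypothetical, and the sketch assumes that the most frequent value is taken over the entries that differ from the no-action token, which is one reading of the description.

```python
import pandas as pd

def mean_action_sketch(actions, numerical_features, categorical_features, no_action_token):
    """Hypothetical re-implementation of the documented behavior."""
    mean_action = {}
    for col in numerical_features:
        # Numerical feature: average the action over all instances.
        mean_action[col] = actions[col].mean()
    for col in categorical_features:
        # Assumption: the mode is taken over entries that differ from the token.
        changed = actions.loc[actions[col] != no_action_token, col]
        mean_action[col] = no_action_token if changed.empty else changed.mode().iloc[0]
    return pd.Series(mean_action, dtype=object)

actions = pd.DataFrame({
    "age": [1.0, 3.0, 2.0],
    "job": ["-", "clerk", "clerk"],
    "city": ["-", "-", "-"],   # no instance changes this feature
})
result = mean_action_sketch(actions, ["age"], ["job", "city"], "-")
```

Here "city" keeps the token because no instance carries an action for it, while "job" resolves to its most frequent non-token value.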
- humancompatible.explain.glance.utils.action.apply_action_numpy(X: ndarray[Any, dtype[number]], action: ndarray[Any, dtype[number]], numerical_columns: List[int], categorical_columns: List[int], categorical_no_action_token: number) → ndarray[Any, dtype[number]]
Apply action to all rows of X. For numerical columns, add the respective component from action. For categorical columns, set the component of all rows to the value of action, unless it is equal to the categorical_no_action_token, in which case do nothing for this column.
Note: the input array must have a numeric dtype, so categorical columns must be encoded as numbers (e.g. with ordinal encoding).
- Parameters:
X (npt.NDArray[np.number]) – matrix of observations
action (npt.NDArray[np.number]) – for each column / feature, the action to be applied
numerical_columns (List[int]) – numerical column indices
categorical_columns (List[int]) – categorical column indices
categorical_no_action_token (np.number) – special value signifying no action (the categorical counterpart of 0 for numerical columns)
- Returns:
new observations resulting from the action application.
- Return type:
npt.NDArray[np.number]
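A minimal sketch of the rule described above, assuming a float array whose categorical columns are ordinal-encoded. The function name apply_action_sketch is hypothetical, and copying the input rather than modifying it in place is an assumption of this sketch, not a documented property of the library function.

```python
import numpy as np

def apply_action_sketch(X, action, numerical_columns, categorical_columns, no_action_token):
    """Hypothetical re-implementation of the documented behavior."""
    X_new = X.copy()  # assumption: the input array is left untouched
    # Numerical columns: shift every row by the matching action component.
    X_new[:, numerical_columns] += action[numerical_columns]
    # Categorical columns: overwrite every row unless the component is the token.
    for col in categorical_columns:
        if action[col] != no_action_token:
            X_new[:, col] = action[col]
    return X_new

X = np.array([[1.0, 0.0, 2.0],
              [3.0, 1.0, 2.0]])
action = np.array([0.5, 2.0, -1.0])  # -1.0 plays the role of the no-action token
X_new = apply_action_sketch(X, action, [0], [1, 2], -1.0)
```

Column 0 is shifted by 0.5, column 1 is set to category 2.0 in every row, and column 2 is unchanged because its action component equals the token.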
- humancompatible.explain.glance.utils.action.apply_action_pandas(X: DataFrame, action: Series, numerical_columns: List[str], categorical_columns: List[str], categorical_no_action_token: Any, numerical_no_action_token: Any | None = None) → DataFrame
Apply action to all rows of X. For numerical columns, add the respective component from action. For categorical columns, set the component of all rows to the value of action, unless it is equal to the categorical_no_action_token, in which case do nothing for this column.
- Parameters:
X (pd.DataFrame) – matrix of observations
action (pd.Series) – for each column / feature, the action to be applied
numerical_columns (List[str]) – numerical column names
categorical_columns (List[str]) – categorical column names
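categorical_no_action_token (Any) – special value signifying no action (the categorical counterpart of 0 for numerical columns)
numerical_no_action_token (Any | None) – optional no-action value for numerical columns; defaults to None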
- Returns:
new observations resulting from the action application.
- Return type:
pd.DataFrame
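The same rule on a DataFrame can be sketched as below. The name apply_action_df_sketch is hypothetical, and the sketch leaves out numerical_no_action_token because the description above does not say how it is used.

```python
import pandas as pd

def apply_action_df_sketch(X, action, numerical_columns, categorical_columns, no_action_token):
    """Hypothetical re-implementation of the documented behavior."""
    X_new = X.copy()  # assumption: the input frame is left untouched
    for col in numerical_columns:
        # Numerical column: add the action component to every row.
        X_new[col] = X_new[col] + action[col]
    for col in categorical_columns:
        # Categorical column: overwrite every row unless the action is the token.
        if action[col] != no_action_token:
            X_new[col] = action[col]
    return X_new

X = pd.DataFrame({"age": [30, 40], "job": ["clerk", "nurse"], "city": ["Rome", "Oslo"]})
action = pd.Series({"age": 5, "job": "manager", "city": "-"})
X_new = apply_action_df_sketch(X, action, ["age"], ["job", "city"], "-")
```

Every row gets five years added and the job "manager", while "city" keeps its original values because the action for it is the token.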
- humancompatible.explain.glance.utils.action.apply_actions_pandas_rows(X: DataFrame, actions: DataFrame, numerical_columns: List[str], categorical_columns: List[str], categorical_no_action_token: object) → DataFrame
Applies a set of actions to transform the original dataset X based on the actions specified in the actions DataFrame.
For numerical columns, the function adds the values from the actions DataFrame to the corresponding columns in X. For categorical columns, if the action for a column is not equal to the categorical_no_action_token, the value from the actions DataFrame is used to update X. Otherwise, the original value from X is retained.
Parameters:
- X : pd.DataFrame
The original dataset, where each row represents an instance, and each column is a feature.
- actions : pd.DataFrame
A DataFrame of the same shape as X, containing the actions to apply to each feature. For numerical columns, it contains the values to add to the corresponding features in X. For categorical columns, it contains either the new value to apply or the categorical_no_action_token.
- numerical_columns : List[str]
List of columns in X and actions that are numerical.
- categorical_columns : List[str]
List of columns in X and actions that are categorical.
- categorical_no_action_token : object
A token or value indicating that no action should be taken for a categorical feature.
Returns:
- pd.DataFrame
A DataFrame of the same shape as X with the actions applied. For numerical columns, each value is the original value plus the corresponding action from actions. For categorical columns, the value from actions is used unless it equals the categorical_no_action_token, in which case the original value from X is retained.
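A minimal sketch of the row-wise variant, where each instance has its own action. The name apply_actions_rows_sketch is hypothetical, and the sketch assumes that X and actions share the same index so that rows line up.

```python
import pandas as pd

def apply_actions_rows_sketch(X, actions, numerical_columns, categorical_columns, no_action_token):
    """Hypothetical re-implementation of the documented behavior."""
    X_new = X.copy()
    for col in numerical_columns:
        # Numerical column: add each row's own action value.
        X_new[col] = X[col] + actions[col]
    for col in categorical_columns:
        # Keep the original value where the action is the token, else take the action.
        X_new[col] = X[col].where(actions[col] == no_action_token, actions[col])
    return X_new

X = pd.DataFrame({"age": [30, 40], "job": ["clerk", "nurse"]})
actions = pd.DataFrame({"age": [2, -1], "job": ["-", "manager"]})
X_new = apply_actions_rows_sketch(X, actions, ["age"], ["job"], "-")
```

The first instance only ages by two years, while the second loses one year and changes job.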
- humancompatible.explain.glance.utils.action.extract_actions_pandas(X: DataFrame, cfs: DataFrame, categorical_features: List[str], numerical_features: List[str], categorical_no_action_token: Any)
Extracts the actions needed to convert the original dataset X into the counterfactual dataset cfs.
For categorical features, the function identifies changes between X and cfs. If no change is observed in a categorical feature, a specified categorical_no_action_token is used to denote that no action is needed. For numerical features, the function computes the difference between the counterfactual and the original values.
Parameters:
- X : pd.DataFrame
The original dataset, where each row represents an instance, and each column is a feature.
- cfs : pd.DataFrame
The counterfactual dataset, which has the same structure as X. It represents the desired state after some action is applied.
- categorical_features : List[str]
List of columns in X and cfs that are categorical.
- numerical_features : List[str]
List of columns in X and cfs that are numerical.
- categorical_no_action_token : Any
A token or value to insert into categorical features where no change is needed (i.e., the feature value in X is the same as in cfs).
Returns:
- pd.DataFrame
A DataFrame of the same shape as X and cfs where each value indicates the action required to transform X into cfs. For categorical features, it holds the value in cfs if it differs from X, otherwise the categorical_no_action_token. For numerical features, it holds the difference between cfs and X.
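The extraction step can be sketched as follows. The name extract_actions_sketch is hypothetical, and the sketch assumes that X and cfs share the same index and columns; it illustrates the description above rather than the library's own code.

```python
import pandas as pd

def extract_actions_sketch(X, cfs, categorical_features, numerical_features, no_action_token):
    """Hypothetical re-implementation of the documented behavior."""
    actions = pd.DataFrame(index=X.index)
    for col in numerical_features:
        # Numerical feature: the action is the signed difference cfs - X.
        actions[col] = cfs[col] - X[col]
    for col in categorical_features:
        # Categorical feature: keep the counterfactual value only where it changed.
        actions[col] = cfs[col].where(cfs[col] != X[col], no_action_token)
    # Restore the column order of X.
    return actions[[c for c in X.columns if c in actions.columns]]

X = pd.DataFrame({"age": [30, 40], "job": ["clerk", "nurse"]})
cfs = pd.DataFrame({"age": [35, 40], "job": ["clerk", "manager"]})
actions = extract_actions_sketch(X, cfs, ["job"], ["age"], "-")
```

By the descriptions in this module, applying the extracted actions row by row to X (as apply_actions_pandas_rows does) should give back cfs.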
humancompatible.explain.glance.utils.centroid module
- humancompatible.explain.glance.utils.centroid.centroid_numpy(X: ndarray[Any, dtype[number]], numerical_columns: List[int], categorical_columns: List[int]) → ndarray[Any, dtype[number]]
Calculates the centroid of the rows of a 2D NumPy array. For the columns in numerical_columns, the centroid takes the mean over all rows; for the columns in categorical_columns, it takes the mode over all rows.
- Parameters:
X (npt.NDArray[np.number]) – matrix of observations
numerical_columns (List[int]) – numerical column indices
categorical_columns (List[int]) – categorical column indices
- Returns:
2D NumPy array whose single row is the centroid
- Return type:
npt.NDArray[np.number]
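A minimal sketch of the mixed mean/mode centroid, assuming a float array with ordinal-encoded categorical columns. The name centroid_sketch is hypothetical, and resolving ties toward the smallest encoded value is a property of this sketch only; the source does not say how the library breaks ties.

```python
import numpy as np

def centroid_sketch(X, numerical_columns, categorical_columns):
    """Hypothetical re-implementation of the documented behavior."""
    # Columns listed in neither group stay NaN in this sketch.
    centroid = np.full((1, X.shape[1]), np.nan)
    # Numerical columns: mean over all rows.
    centroid[0, numerical_columns] = X[:, numerical_columns].mean(axis=0)
    # Categorical columns: most frequent encoded value.
    for col in categorical_columns:
        values, counts = np.unique(X[:, col], return_counts=True)
        centroid[0, col] = values[np.argmax(counts)]
    return centroid

X = np.array([[1.0, 0.0],
              [3.0, 2.0],
              [5.0, 2.0]])
centroid = centroid_sketch(X, numerical_columns=[0], categorical_columns=[1])
```

The result has shape (1, 2): the mean 3.0 for the numerical column and the most frequent category 2.0 for the categorical one.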
- humancompatible.explain.glance.utils.centroid.centroid_pandas(X: DataFrame, numerical_columns: List[str], categorical_columns: List[str]) → DataFrame
Calculates the centroid of the rows of a pandas DataFrame. For the columns in numerical_columns, the centroid takes the mean over all rows; for the columns in categorical_columns, it takes the mode over all rows.
- Parameters:
X (pd.DataFrame) – matrix of observations
numerical_columns (List[str]) – numerical column names
categorical_columns (List[str]) – categorical column names
- Returns:
DataFrame whose single row is the centroid
- Return type:
pd.DataFrame
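The DataFrame version can be sketched in the same way. The name centroid_df_sketch is hypothetical, and taking the first value returned by Series.mode when several values tie is a choice made for this sketch, not something the source specifies.

```python
import pandas as pd

def centroid_df_sketch(X, numerical_columns, categorical_columns):
    """Hypothetical re-implementation of the documented behavior."""
    row = {}
    for col in numerical_columns:
        # Numerical column: mean over all rows.
        row[col] = X[col].mean()
    for col in categorical_columns:
        # Categorical column: most frequent value (first one on ties).
        row[col] = X[col].mode().iloc[0]
    # One-row DataFrame, with columns in the same order as X.
    return pd.DataFrame([row])[[c for c in X.columns if c in row]]

X = pd.DataFrame({"age": [20, 30, 40], "job": ["clerk", "nurse", "clerk"]})
centroid = centroid_df_sketch(X, ["age"], ["job"])
```

The single row of the result holds the mean age 30.0 and the most frequent job "clerk".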