FCX Unary Evaluation (Adult)

humancompatible.explain.fcx.evaluate_unary_adult.evaluate_adult(base_data_dir: str = 'data/', base_model_dir: str = 'models/', dataset_name: str = 'adult', pth_name: str = 'models/adult_binary.pth') → dict

Load the Adult test split, the pre‑trained black-box classifier and FCX-VAE model, generate counterfactuals, and compute evaluation metrics.

This wrapper runs the full evaluation pipeline for the Adult dataset, returning a dictionary of metrics including validity, causal feasibility, continuous and categorical proximity, and LOF anomaly scores.

Parameters:

base_data_dir (str) – Path to the directory containing the preprocessed test split (.npy files) and JSON weight files.
base_model_dir (str) – Directory where the black-box model and VAE checkpoints are saved.
dataset_name (str) – Dataset identifier (default ‘adult’), used to name files like {dataset_name}-test-set.npy and {dataset_name}.pth.
pth_name (str) – Filename of the trained VAE checkpoint relative to base_model_dir (default ‘models/adult_binary.pth’).

Returns:

A mapping { dataset_name: metrics_dict }, where metrics_dict has keys ‘validity’, ‘const-score’, ‘cont-prox’, ‘cat-prox’, and ‘LOF’, each containing the computed score(s) for the FCX-VAE counterfactuals.

Return type:

dict
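
The returned mapping is nested one level deep, so reading a score takes two lookups. A minimal sketch of walking that structure follows. The summarize helper and the numeric values are hypothetical placeholders for illustration, not real results, and each score is assumed to be either a single number or a list of numbers.

```python
METRIC_KEYS = ("validity", "const-score", "cont-prox", "cat-prox", "LOF")

def summarize(results):
    """Flatten {dataset_name: {metric: score(s)}} into printable lines."""
    lines = []
    for dataset, metrics in results.items():
        for key in METRIC_KEYS:
            value = metrics.get(key)
            # Assumption: a score is a scalar or a list of scalars.
            values = list(value) if isinstance(value, (list, tuple)) else [value]
            lines.append(f"{dataset}/{key}: {values}")
    return lines

# Placeholder numbers only; they are not measured results.
example = {
    "adult": {
        "validity": [1.0],
        "const-score": [2.0],
        "cont-prox": [3.0],
        "cat-prox": [4.0],
        "LOF": [5.0],
    }
}

for line in summarize(example):
    print(line)
```

In practice the dictionary would come from a call such as evaluate_adult(base_data_dir='data/', base_model_dir='models/'), using the defaults shown in the signature.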

humancompatible.explain.fcx.scripts.evaluation_functions_adult.causal_score_age_constraint(model, pred_model, train_dataset, d, normalise_weights, offset, case, sample_range)

Compute feasibility scores based on a monotonic age constraint for counterfactuals.

For each sample_size in sample_range, this function:

Calls model.compute_elbo to generate sample_size counterfactuals per example.
De-normalizes predictions (x_pred) and originals (x_true) via the DataLoader.
Checks that the counterfactual age ≥ original age plus a scaled offset.
Computes the percentage of valid and invalid age‐constraint satisfactions.

Parameters:

model (FCX_VAE) – The trained counterfactual VAE, providing compute_elbo(…).
pred_model (BlackBox) – Pre‑trained classifier used to determine target labels.
train_dataset (np.ndarray) – Array of shape (N, d+1), where the last column is the true label.
d (DataLoader) –
DataLoader instance with methods:
- get_decoded_data(…) to map back to original feature names,
- de_normalize_data(…) to apply inverse scaling.
normalise_weights (dict[int, tuple[float, float]]) – Per‑feature (min, max) values for de-scaling.
offset (float) – Raw offset, converted to the original age scale via de_scale(offset, normalise_weights[0]) and used as the margin when comparing ages.
case (bool or int) – If truthy, plot valid vs invalid percentages (unused by default).
sample_range (list[int]) – List of Monte Carlo sample counts to evaluate (e.g., [1, 2, 3]).

Returns:

valid_score_arr (list[float]): Percentage of counterfactuals satisfying the age constraint, for each sample_size.
invalid_score_arr (list[float]): Percentage violating the age constraint, for each sample_size.

Return type:

tuple
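
The age check described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's implementation: the helper names, the sample ages, and the (17, 90) range are hypothetical, and the offset is added to the original age as in the step list.

```python
def de_scale_sketch(x, min_max):
    """Absolute offset on the original scale: (max - min) * x."""
    lo, hi = min_max
    return (hi - lo) * x

def age_constraint_scores(orig_ages, cf_ages, offset, age_range):
    """Return (valid %, invalid %) for the check cf_age >= orig_age + margin."""
    margin = de_scale_sketch(offset, age_range)
    valid = sum(cf >= orig + margin for orig, cf in zip(orig_ages, cf_ages))
    valid_pct = 100.0 * valid / len(orig_ages)
    return valid_pct, 100.0 - valid_pct

# Hypothetical de-normalized ages for four originals and their counterfactuals.
orig = [25, 40, 33, 58]
cf = [30, 38, 33, 61]

# Only the second pair violates the constraint (38 < 40).
print(age_constraint_scores(orig, cf, offset=0.0, age_range=(17, 90)))  # (75.0, 25.0)
```

Repeating this computation once per entry in sample_range would give the two lists that the function returns.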

humancompatible.explain.fcx.scripts.evaluation_functions_adult.causal_score_age_constraint_lof(model, pred_model, train_dataset, d, normalise_weights, offset, case, sample_range, prefix_name='test')

Compute an age-constraint violation penalty together with an LOF anomaly score for generated counterfactuals (LOF is computed for the feasible counterfactual examples).

For each sample in train_dataset, this function:

Generates counterfactuals via the FCX‑VAE model conditioned to flip the pred_model label.
Measures any violations of the monotonic age constraint relative to the original age feature (i.e. that age should not decrease).
Computes a Local Outlier Factor (LOF) anomaly score on the latent encodings.
Returns a single combined score per sample.

Parameters:

model (FCX_VAE) – Trained counterfactual VAE.
pred_model (BlackBox) – Pre-trained classifier used for validity checks.
train_dataset (np.ndarray) – Original examples with labels, shape (N, num_features+1).
d (DataLoader) – DataLoader instance providing feature splits and metadata.
normalise_weights (dict[int, tuple[float, float]]) – Per-feature (min, max) weights for proximity and scaling.
offset (int) – Index offset indicating how many initial features to treat as immutable (used to locate the age feature).
case (int) – Metric case identifier (currently unused).
sample_range (list[int]) – Indices of Monte Carlo samples to generate counterfactuals.
prefix_name (str, optional) – Prefix for any output logging or filenames (default: ‘test’).

Returns:

LOF scores.

Return type:

list[float]

humancompatible.explain.fcx.scripts.evaluation_functions_adult.causal_score_age_ed_constraint(model, pred_model, train_dataset, d, normalise_weights, offset, case, sample_range)

Compute feasibility scores based on age and education monotonicity constraints for binary counterfactuals.

For each sample_size in sample_range, this function:

Generates sample_size counterfactuals per example via model.compute_elbo, conditioned to flip pred_model’s output.
De-normalizes counterfactuals (x_pred) and originals (x_true) with d.get_decoded_data and d.de_normalize_data.
Checks two monotonic constraints for each counterfactual:
- Age constraint: CF age ≥ original age + de-scaled offset.
- Education constraint: CF education level ≥ original education level + de-scaled offset.
Computes the percentage of valid vs. invalid counterfactuals for each sample_size.

Parameters:

model (FCX_VAE) – Trained counterfactual VAE model providing compute_elbo(…).
pred_model (BlackBox) – Pre‑trained classifier used to determine target labels.
train_dataset (np.ndarray) – Array of original examples with labels, shape (N, d+1).
d (DataLoader) – DataLoader instance for decoding and de-normalizing features.
normalise_weights (dict[int, tuple[float, float]]) – Per-feature (min, max) values for scaling/offset computations.
offset (float) – Raw offset applied when checking monotonic constraints.
case (int or bool) – If truthy, may trigger optional plotting (unused by default).
sample_range (list[int]) – Monte Carlo sample counts to evaluate (e.g., [1, 2, 3]).

Returns:

valid_score_arr (list[float]): Percentage of counterfactuals satisfying both age and education constraints, for each sample_size.
invalid_score_arr (list[float]): Percentage violating at least one constraint, for each sample_size.

Return type:

tuple
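
A counterfactual counts as valid here only when both constraints hold at once, which the following sketch makes explicit. It is a simplified illustration: the dictionary layout, the integer coding of education levels, and the zero margins are assumptions, not details taken from the library.

```python
def both_constraints_hold(orig, cf, age_margin=0.0, edu_margin=0.0):
    """True if neither age nor education level falls below original + margin."""
    age_ok = cf["age"] >= orig["age"] + age_margin
    edu_ok = cf["education"] >= orig["education"] + edu_margin
    return age_ok and edu_ok

def valid_invalid_pct(pairs):
    """Percentage of pairs satisfying both constraints, and the complement."""
    flags = [both_constraints_hold(orig, cf) for orig, cf in pairs]
    valid = 100.0 * sum(flags) / len(flags)
    return valid, 100.0 - valid

# Hypothetical (original, counterfactual) pairs with ordinal education codes.
pairs = [
    ({"age": 30, "education": 2}, {"age": 32, "education": 3}),  # both hold
    ({"age": 30, "education": 2}, {"age": 29, "education": 3}),  # age decreases
    ({"age": 45, "education": 4}, {"age": 45, "education": 1}),  # education decreases
    ({"age": 22, "education": 1}, {"age": 22, "education": 1}),  # unchanged, holds
]

print(valid_invalid_pct(pairs))  # (50.0, 50.0)
```

A pair that breaks either constraint lands in the invalid percentage, matching the "at least one constraint" wording of invalid_score_arr.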

humancompatible.explain.fcx.scripts.evaluation_functions_adult.causal_score_age_ed_constraint_lof(model, pred_model, train_dataset, d, normalise_weights, offset, case, sample_range, prefix_name='test')

Compute a combined age and education constraint violation penalty with LOF anomaly score for binary counterfactuals.

For each sample_size in sample_range, this function:

Generates sample_size counterfactuals per example via model.compute_elbo, conditioned to flip pred_model’s output.
De-normalizes the counterfactuals and originals using d.get_decoded_data and d.de_normalize_data.
Evaluates two monotonic constraints:
- Age constraint: CF age ≥ original age + de-scaled offset.
- Education constraint: CF education level ≥ original education level (using offset and normalise_weights).
Computes a Local Outlier Factor (LOF) anomaly score on the VAE’s latent codes.
Combines the two constraint violation rates with the LOF score into a single numeric score per example.

Parameters:

model (FCX_VAE) – Trained counterfactual VAE model.
pred_model (BlackBox) – Pre‑trained classifier used for validity conditioning.
train_dataset (np.ndarray) – Array of examples + labels, shape (N, d+1).
d (DataLoader) – DataLoader instance for decoding and de-normalizing features.
normalise_weights (dict[int, tuple[float, float]]) – Per-feature (min, max) values for scaling.
offset (float) – Raw offset applied when checking monotonic constraints.
case (int or bool) – If truthy, may trigger plotting (unused by default).
sample_range (list[int]) – Monte Carlo sample counts to evaluate (e.g., [1, 2, 3]).
prefix_name (str, optional) – Prefix for any saved outputs or logs (default: ‘test’).

Returns:

LOF score

Return type:

list[float]

humancompatible.explain.fcx.scripts.evaluation_functions_adult.compute_eval_metrics_adult(immutables, methods, base_model_dir, encoded_size, pred_model, val_dataset, d, normalise_weights, mad_feature_weights, div_case, case, sample_range, filename, prefix_name='samples')

Compute a specified evaluation metric for each trained FCX-VAE model on the Adult dataset.

For each entry in methods, this function:

Loads the corresponding VAE checkpoint.
Samples counterfactuals on the held-out validation set.
Computes a single metric given by case:
- 0: validity (fraction of CFs that flip the classifier)
- 1: feasibility (constraint score)
- 2: continuous proximity
- 3: categorical proximity
- 4: LOF anomaly score
Stores the result in a dictionary under the method’s key.

Parameters:

immutables (bool) – If True, treats the last 4 features as immutable during CF generation.
methods (dict[str, str]) – Mapping from method name to VAE checkpoint filepath.
base_model_dir (str) – Directory where model checkpoints are stored.
encoded_size (int) – Latent dimensionality of the FCX‑VAE.
pred_model (BlackBox) – Pre-trained black-box classifier for validity checking.
val_dataset (np.ndarray) – Validation data array including labels, shape (N, d+1).
d (DataLoader) – DataLoader instance providing feature encodings and metadata.
normalise_weights (dict[int, tuple[float, float]]) – Per-feature (min, max) weights for proximity calculations.
mad_feature_weights (dict[int, Any]) – Feature weights used for LOF anomaly scoring.
div_case (int) – Diversity case identifier (unused in basic metrics).
case (int) – Metric case index to compute: 0=validity, 1=feasibility, 2=cont-prox, 3=cat‑prox, 4=LOF.
sample_range (list[int]) – Indices of Monte Carlo samples to evaluate.
filename (str) – Base name for saving any output plots or arrays.
prefix_name (str, optional) – Prefix for naming sample output files (default: ‘samples’).

Returns:

Mapping each method name to its computed metric value (float).

Return type:

dict
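
The case index acts as a selector over five metrics, applied once per entry in methods. A rough sketch of that dispatch pattern is shown below. Everything in it is hypothetical scaffolding: the stub metric functions return fixed numbers instead of loading checkpoints, and the method name and path are made up for illustration.

```python
CASE_NAMES = {0: "validity", 1: "feasibility", 2: "cont-prox", 3: "cat-prox", 4: "LOF"}

def compute_for_methods(methods, case, metric_fns):
    """Apply the metric selected by `case` to every method's checkpoint path."""
    if case not in CASE_NAMES:
        raise ValueError(f"unknown case {case!r}; expected one of {sorted(CASE_NAMES)}")
    metric = metric_fns[CASE_NAMES[case]]
    return {name: metric(path) for name, path in methods.items()}

# Stub metrics: each ignores the path and returns its own case index as a float.
stub_fns = {name: (lambda path, v=float(i): v) for i, name in CASE_NAMES.items()}

# Hypothetical method name and checkpoint path.
methods = {"fcx": "adult_binary.pth"}

print(compute_for_methods(methods, case=2, metric_fns=stub_fns))  # {'fcx': 2.0}
```

Rejecting unknown case values up front is a design choice of this sketch; the source does not say how the library handles an out-of-range index.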

humancompatible.explain.fcx.scripts.evaluation_functions_adult.de_normalise(x, normalise_weights)

Map a normalized feature value back to its original scale.

Parameters:

x (float or array-like) – The normalized value(s) in the range [0, 1].
normalise_weights (tuple[float, float]) – A (min, max) tuple giving the original feature’s range.

Returns:

The de-normalized value(s), scaled back to [min, max].

Return type:

float or array-like
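
Mapping a value from [0, 1] back to [min, max] is usually done with the inverse of min-max scaling. The sketch below assumes that standard formula, min + x * (max - min); the function name and the (17, 90) age range are illustrative and not taken from the library.

```python
def de_normalise_sketch(x, normalise_weights):
    """Inverse min-max scaling: map x in [0, 1] back to [min, max]."""
    lo, hi = normalise_weights
    if isinstance(x, (list, tuple)):
        return [lo + (hi - lo) * v for v in x]
    return lo + (hi - lo) * x

age_range = (17, 90)  # illustrative (min, max) for an age feature

print(de_normalise_sketch(0.0, age_range))  # 17.0
print(de_normalise_sketch(1.0, age_range))  # 90.0
print(de_normalise_sketch([0.0, 0.5, 1.0], age_range))  # [17.0, 53.5, 90.0]
```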

humancompatible.explain.fcx.scripts.evaluation_functions_adult.de_scale(x, normalise_weights)

Convert a normalized offset into an absolute difference on the feature’s original scale.

Given a normalized offset x in [0, 1], and a feature’s original (min, max) range, returns the corresponding absolute difference.

Parameters:

x (float) – The normalized offset (e.g., 0.05 represents 5% of the range).
normalise_weights (tuple[float, float]) – A (min, max) tuple for the feature’s original range.

Returns:

The absolute offset in the original scale (i.e., (max - min) * x).

Return type:

float
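
The formula (max - min) * x rescales a difference rather than a position, so the feature minimum is never added. A small worked example, with a hypothetical function name and an illustrative (17, 90) age range:

```python
import math

def de_scale_sketch(x, normalise_weights):
    """Absolute offset on the original scale: (max - min) * x."""
    lo, hi = normalise_weights
    return (hi - lo) * x

age_range = (17, 90)  # illustrative (min, max); the range width is 73
offset_years = de_scale_sketch(0.05, age_range)

# 5% of a 73-year range is about 3.65 years.
print(round(offset_years, 2))
print(math.isclose(offset_years, 3.65))
```

Because only the width of the range matters, shifting both min and max by the same amount leaves the result unchanged.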

humancompatible.explain.fcx.scripts.evaluation_functions_adult.lof_score_func(data)

Compute the average Local Outlier Factor (LOF) anomaly score for a dataset.

This function fits a LocalOutlierFactor model (with 20 neighbors and Euclidean distance) on the input data and returns the mean LOF score across all samples. Higher LOF scores indicate a greater degree of anomaly.

Parameters:

data (array-like of shape (n_samples, n_features)) – The input data on which to compute anomaly scores.

Returns:

The average LOF anomaly score (mean of -negative_outlier_factor_) over all samples.

Return type:

float
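
The computation described above can be reproduced with scikit-learn's LocalOutlierFactor. The following is a sketch built from that description (20 neighbors, Euclidean distance, mean of the negated negative_outlier_factor_), not a copy of the library's code; the function name and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def mean_lof_score(data, n_neighbors=20):
    """Fit LOF on `data` and return the mean of -negative_outlier_factor_."""
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, metric="euclidean")
    lof.fit(data)
    return float(np.mean(-lof.negative_outlier_factor_))

# Synthetic data: a Gaussian cluster, then the same cluster plus a few far-away points.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(200, 2))
with_outliers = np.vstack([inliers, rng.uniform(8, 10, size=(5, 2))])

print(mean_lof_score(inliers))
print(mean_lof_score(with_outliers))
```

Points in a dense region score close to 1, so adding isolated points pulls the mean upward.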

humancompatible.explain.fcx.scripts.evaluation_functions_adult.proximity_score(model, pred_model, train_dataset, d, mad_feature_weights, cat, case, sample_range)

Compute continuous or categorical L1 proximity between originals and counterfactuals.

For each sample_size in sample_range, this function:

Generates sample_size counterfactuals per example via model.compute_elbo, conditioned to flip pred_model’s output.
De-normalizes both originals (x_true) and counterfactuals (x_pred) using d.get_decoded_data and d.de_normalize_data.
Computes L1 distances:
- If cat is False: sums absolute differences over continuous features.
- If cat is True: sums absolute differences over one-hot encoded categorical features.
Averages distances across MC samples and examples.

Parameters:

model (FCX_VAE) – Trained counterfactual VAE providing compute_elbo(…).
pred_model (BlackBox) – Pre‑trained classifier for validity conditioning.
train_dataset (np.ndarray) – Array of original examples with labels, shape (N, d+1).
d (DataLoader) – DataLoader instance for decoding and de-normalizing features.
mad_feature_weights (dict[int, float]) – Mean absolute deviation weights for continuous features.
cat (bool) – If False, compute continuous proximity; if True, compute categorical proximity.
case (int or bool) – If truthy, may trigger plotting (unused by default).
sample_range (list[int]) – Monte Carlo sample counts to evaluate (e.g., [1, 2, 3]).

Returns:

Proximity scores (average L1 distances)

Return type:

list[float]
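
The two proximity variants differ only in which columns enter the L1 sum. The sketch below shows that with an unweighted distance on a made-up feature layout. The column indices and rows are hypothetical, and the sketch does not apply mad_feature_weights, since the source does not say how they enter the distance.

```python
def l1_proximity(originals, counterfactuals, feature_idx):
    """Average L1 distance over the selected feature columns."""
    total = 0.0
    for x_true, x_pred in zip(originals, counterfactuals):
        total += sum(abs(x_true[i] - x_pred[i]) for i in feature_idx)
    return total / len(originals)

# Hypothetical layout: columns 0-1 are continuous, columns 2-4 are one one-hot category.
CONT_IDX, CAT_IDX = [0, 1], [2, 3, 4]
originals = [[30.0, 40.0, 1, 0, 0], [50.0, 20.0, 0, 1, 0]]
counterfactuals = [[33.0, 40.0, 0, 1, 0], [50.0, 25.0, 0, 1, 0]]

print(l1_proximity(originals, counterfactuals, CONT_IDX))  # 4.0
print(l1_proximity(originals, counterfactuals, CAT_IDX))   # 1.0
```

Note that a changed category contributes 2 to a one-hot L1 sum (one entry turns off, another turns on), which is why the first row alone accounts for the categorical distance here.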

humancompatible.explain.fcx.scripts.evaluation_functions_adult.validity_score(model, pred_model, train_dataset, case, sample_range, d=None)

Compute the validity of generated counterfactuals as the percentage that successfully flip the black‑box classifier’s prediction.

For each sample_size in sample_range, this function:

Generates sample_size counterfactuals per example via model.compute_elbo, conditioned on the opposite class from pred_model.
Applies pred_model to each generated counterfactual.
Computes the fraction of counterfactuals whose predicted label differs from the original label.

Parameters:

model (FCX_VAE) – Trained counterfactual VAE providing compute_elbo(…).
pred_model (BlackBox) – Pre‑trained classifier used to judge validity of counterfactuals.
train_dataset (np.ndarray) – Array of original examples with labels, shape (N, d+1).
case (int or bool) – If truthy, may trigger optional plotting (unused by default).
sample_range (list[int]) – Monte Carlo sample counts to evaluate (e.g., [1, 2, 3]).
d (DataLoader, optional) – DataLoader instance for decoding or de-normalizing data (if needed).

Returns:

Validity percentages (0–100)

Return type:

list[float]
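
Validity reduces to counting label flips. A minimal sketch follows, using a toy threshold rule as a stand-in for the black-box classifier; the rule, the function names, and the sample rows are all hypothetical, and the original label is taken to be the classifier's prediction on the original example.

```python
def validity_pct(original_labels, cf_labels):
    """Percentage (0-100) of counterfactuals whose label differs from the original."""
    flipped = sum(o != c for o, c in zip(original_labels, cf_labels))
    return 100.0 * flipped / len(original_labels)

def toy_classifier(row):
    """Stand-in for the black-box model: predicts 1 when the first feature exceeds 0.5."""
    return int(row[0] > 0.5)

# Hypothetical normalized inputs and their generated counterfactuals.
originals = [[0.2], [0.4], [0.9], [0.7]]
counterfactuals = [[0.8], [0.3], [0.1], [0.2]]

orig_labels = [toy_classifier(r) for r in originals]       # [0, 0, 1, 1]
cf_labels = [toy_classifier(r) for r in counterfactuals]   # [1, 0, 0, 0]

# Three of the four counterfactuals change the predicted label.
print(validity_pct(orig_labels, cf_labels))  # 75.0
```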