FCX Unary Constraint Training (Adult)

humancompatible.explain.fcx.FCX_unary_generation_adult.compute_loss(model, model_out, x, target_label, normalise_weights, validity_reg, margin, adj_matrix, pred_model)[source]

Compute the combined ELBO, validity hinge-loss, sparsity penalty, and causal regularization for a batch of counterfactual examples.

This loss aggregates:
  • KL divergence between the encoder’s distribution and prior.

  • Reconstruction error (L1 proximity on mutable features).

  • Validity hinge-loss to enforce classifier flip.

  • Sparsity penalty to encourage minimal changes.

  • Causal regularization via a provided adjacency matrix.

Parameters:
  • model (FCX_VAE) – The counterfactual VAE model instance.

  • model_out (dict) –

    Outputs from model.forward, containing:
    • ’em’ (Tensor): encoder means, shape (batch, latent_dim)

    • ’ev’ (Tensor): encoder variances, shape (batch, latent_dim)

    • ’z’ (list[Tensor]): latent samples for each MC draw

    • ’x_pred’ (list[Tensor]): reconstructions per MC sample

    • ’mc_samples’ (int): number of Monte Carlo draws

  • x (Tensor) – Original input features, shape (batch, d).

  • target_label (Tensor) – True class labels (0 or 1), shape (batch,).

  • normalise_weights (dict[int, tuple(float, float)]) – Mapping feature index → (min, max) for proximity weighting.

  • validity_reg (float) – Weight of the validity hinge‐loss term.

  • margin (float) – Hinge loss margin hyperparameter.

  • adj_matrix (Tensor) – Binary causal adjacency matrix, shape (d, d).

Returns:

Scalar total loss combining KL, reconstruction, validity, sparsity, and causal regularization.

Return type:

Tensor

humancompatible.explain.fcx.FCX_unary_generation_adult.train_constraint_loss(model, train_dataset, optimizer, normalise_weights, validity_reg, constraint_reg, margin, epochs=1000, batch_size=1024, adj_matrix=None, pred_model=None)[source]

Perform one epoch of FCX_VAE training under causal and LOF constraints.

This routine:
  1. Wraps the raw train_dataset into a DataLoader for batching.

  2. For each batch: - Computes counterfactuals via model(train_x, train_y). - Calculates the combined ELBO + validity + sparsity + causal loss

    using compute_loss.

    • Adds a hinge-loss enforcing age monotonicity as a causal constraint.

    • Computes a Local Outlier Factor (LOF) penalty on the latent code.

    • Backpropagates and steps the optimizer.

  3. Accumulates and returns the total loss over all batches.

Parameters:
  • model (FCX_VAE) – The VAE-based counterfactual generator.

  • train_dataset (array-like) – Training data array of shape (N, d+1), with labels in the last column.

  • optimizer (torch.optim.Optimizer) – Optimizer instance for updating VAE parameters.

  • normalise_weights (dict[int, tuple(float, float)]) – Feature (min, max) pairs for proximity scaling.

  • validity_reg (float) – Weight for the validity hinge-loss term.

  • constraint_reg (float) – Weight for the causal age‐constraint hinge-loss.

  • margin (float) – Margin hyperparameter for all hinge-loss terms.

  • epochs (int, optional) – Number of epochs to train (currently only one epoch is run per call).

  • batch_size (int, optional) – Mini-batch size for the DataLoader.

  • adj_matrix (torch.Tensor, optional) – Binary causal adjacency matrix for compute_loss.

Returns:

The sum of all batch losses over the epoch.

Return type:

float

humancompatible.explain.fcx.FCX_unary_generation_adult.train_unary_fcx_vae(dataset_name, base_data_dir, base_model_dir, batch_size, epochs, validity, feasibility, margin)[source]

Train a unary FCX-VAE model to generate counterfactual explanations on the Adult dataset.

This function:
  1. Loads the Adult dataset.

  2. Loads normalization weights and MAD feature weights.

  3. Loads a pre-trained BlackBox classifier.

  4. Initializes and trains the FCX-VAE under causal (+LOF) constraints by calling train_constraint_loss for the specified number of epochs.

  5. Prints timing and loss statistics, and saves the final VAE checkpoint to disk.

Parameters:
  • dataset_name (str) – Name of the dataset (e.g. “adult”).

  • base_data_dir (str) – Path to the directory containing {dataset_name}-*.npy and weight JSONs.

  • base_model_dir (str) – Directory in which to read/write model .pth files.

  • batch_size (int) – Mini-batch size for VAE training.

  • epochs (int) – Number of training epochs to run.

  • validity (float) – Weight for the validity hinge-loss term.

  • feasibility (float) – Weight for the feasibility constraint.

  • margin (float) – Margin hyperparameter for hinge-losses.

Returns:

None

class humancompatible.explain.fcx.scripts.fcx_vae_model.FCX_VAE(data_size, encoded_size, d, immutables=-2)[source]

Bases: Module

Conditional Variational Autoencoder for generating feasible counterfactual explanations.

This model encodes an input feature vector x concatenated with a target class label c into a latent representation, then decodes back to a counterfactual example. Supports multiple Monte Carlo draws and computes ELBO and validity conditioned outputs.

immutables

Number of initial immutable features (suffix features are mutable).

Type:

int

encoded_size

Dimensionality of the latent code.

Type:

int

data_size

Number of input features.

Type:

int

encoded_categorical_feature_indexes

Index ranges for one-hot categorical features.

Type:

List[List[int]]

encoded_continuous_feature_indexes

Indices for continuous features.

Type:

List[int]

encoded_start_cat

Index where categorical features begin.

Type:

int

encoder_mean

Network mapping input → latent mean.

Type:

nn.Sequential

encoder_var

Network mapping input → latent variance.

Type:

nn.Sequential

decoder_mean

Network mapping latent + label → reconstructed features.

Type:

nn.Sequential

Initialize the FCX-VAE model.

Parameters:
  • data_size (int) – Number of encoded input features (excluding label).

  • encoded_size (int) – Dimensionality of the latent space.

  • d (DataLoader) – Provides feature metadata (categorical splits, decoding).

  • immutables (int, optional) – Count of immutable feature columns at the end of x. Defaults to -2 (last two features are immutable).

compute_elbo(x, c, pred_model, ret=False)[source]

Compute Evidence Lower Bound (ELBO) components for given inputs.

If ret is True, also return de-normalized originals, reconstructions, predicted labels, and latent code for analysis.

Parameters:
  • x (torch.Tensor) – Original features, may be truncated to mutable features if ret.

  • c (torch.Tensor) – Target labels, shape (batch_size,).

  • pred_model (nn.Module) – BlackBox classifier to evaluate validity.

  • ret (bool) – Whether to return extras for visualization.

Returns:

(log_px_z, kl_div, x, x_pred, cf_labels) If ret=True:

(log_px_z, kl_div, x_orig, x_pred_mod, cf_labels, z_code)

Return type:

If ret=False

decoder(z)[source]

Decode latent code back to feature space.

Parameters:

z (torch.Tensor) – Concatenated latent code and label, shape (batch_size, encoded_size+1).

Returns:

Reconstructed features, shape (batch_size, data_size).

Return type:

torch.Tensor

encoder(x)[source]

Encode input (x + label) into latent mean and log‑variance.

Parameters:

x (torch.Tensor) – Concatenated input and class label, shape (batch_size, data_size+1).

Returns:

Latent means, shape (batch_size, encoded_size). logvar (torch.Tensor): Latent log-variances, shape (batch_size, encoded_size).

Return type:

mean (torch.Tensor)

forward(x, c)[source]

Generate multiple Monte Carlo counterfactual draws.

Parameters:
  • x (torch.Tensor) – Original features, shape (batch_size, data_size).

  • c (torch.Tensor) – Target class label (0/1), shape (batch_size,).

Returns:

{

‘em’: encoder means, ‘ev’: encoder variances, ‘z’: list of sampled latent codes, ‘x_pred’: list of reconstructed counterfactuals, ‘mc_samples’: int number of MC draws

}

Return type:

dict

normal_likelihood(x, mean, logvar, raxis=1)[source]

Compute the log-probability of x under N(mean, logvar).

Parameters:
  • x (torch.Tensor) – Original or reconstructed features.

  • mean (torch.Tensor) – Latent means.

  • logvar (torch.Tensor) – Latent variances.

  • raxis (int) – Axis along which to sum log-likelihood.

Returns:

Log-likelihood per example.

Return type:

torch.Tensor

sample_latent_code(mean, logvar)[source]

Perform the reparameterization trick to sample latent code.

z = mean + sqrt(logvar) * eps, where eps ~ N(0, I).

Parameters:
  • mean (torch.Tensor) – Latent means.

  • logvar (torch.Tensor) – Latent log-variances.

Returns:

Sampled latent code.

Return type:

torch.Tensor

class humancompatible.explain.fcx.LOFLoss.LOFLoss(n_neighbors=20, epsilon=1e-08)[source]

Bases: Module

Local Outlier Factor (LOF) as a differentiable loss for anomaly detection.

This loss computes the average LOF score over a batch of input vectors, penalizing points that are outliers in the feature space.

Parameters:
  • n_neighbors (int) – Number of nearest neighbors to consider (excluding self) when computing reachability distances. Default is 20.

  • epsilon (float) – Small constant to ensure numerical stability (no zero distances or division by zero). Default is 1e-8.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input)[source]

Compute the LOF loss (mean LOF score) for the input batch.

Steps:
  1. Compute pairwise distances.

  2. Identify the k nearest neighbors for each point (excluding self).

  3. Compute reachability distances: max(distance_to_neighbor, neighbor_kth_distance).

  4. Compute local reachability density (LRD) for each point.

  5. Compute LOF score as the average ratio of neighbors’ LRD to own LRD.

  6. Return the mean LOF score over all points.

Parameters:

input (torch.Tensor) – Input tensor of shape (batch_size, feature_dim).

Returns:

Scalar tensor containing the mean LOF score across the batch.

Return type:

torch.Tensor

pairwise_distances(x)[source]

Compute the pairwise Euclidean distances between rows of x.

Parameters:

x (torch.Tensor) – Input tensor of shape (batch_size, feature_dim).

Returns:

A (batch_size × batch_size) distance matrix, where entry (i, j) is the Euclidean distance between x[i] and x[j].

Return type:

torch.Tensor

humancompatible.explain.fcx.scripts.datagen_func.prepare_datasets(dataset_name: str, base_dir: str = '../data/') None[source]

Prepare and save train/validation/test splits plus normalization metadata for a given dataset (Only in case the provided csv/npy files are missing or for custom solutions).

This will:
  1. Load one of: ‘adult’, ‘folktables_adult’, ‘census’, or ‘law’.

  2. Filter / slice the raw DataFrame as in your original logic.

  3. Compute median absolute deviations (MAD) and per-feature min/max.

  4. One‑hot encode, normalize continuous features, and reorder columns.

  5. Split into 10% test, 10% validation, and the rest training.

  6. Save: - {base_dir}/{dataset_name}-train-set.npy - {base_dir}/{dataset_name}-val-set.npy - {base_dir}/{dataset_name}-test-set.npy - {base_dir}/{dataset_name}-normalise_weights.json - {base_dir}/{dataset_name}-mad.json

Parameters:
  • dataset_name (str) – which dataset to prepare; one of ‘adult’, ‘folktables_adult’, ‘census’, or ‘law’.

  • base_dir (str) – directory under which to write out the .npy and .json files.