Strange sparsity results
Hi!
I’ve noticed some potentially wrong sparsity (L0/Distance_1) results due to some very small numbers.
When running Carla’s benchmarking code for the cem-vae method on the first 10 test observations:
from carla.data.catalog import OnlineCatalog
from carla.models.catalog import MLModelCatalog
from carla.models.negative_instances import predict_negative_instances
from carla import Benchmark
import carla.recourse_methods.catalog as recourse_catalog
import torch

dataset = OnlineCatalog("adult")
torch.manual_seed(0)
n_test = 10

ml_model = MLModelCatalog(
    dataset,
    model_type="ann",
    load_online=False,
    backend="pytorch"
)
ml_model.train(
    learning_rate=0.002,
    epochs=20,
    batch_size=1024,
    hidden_size=[18, 9, 3],
    force_train=True,
)

hyperparams = {
    "data_name": "adult",
    "batch_size": 1,
    "kappa": 0.1,
    "init_learning_rate": 0.01,
    "binary_search_steps": 9,
    "max_iterations": 100,
    "initial_const": 10,
    "beta": 0.9,
    "gamma": 1.0,  # 0.0, # 1.0
    "mode": "PN",
    "num_classes": 2,
    "ae_params": {"hidden_layer": [20, 10, 7], "train_ae": True, "epochs": 5},
}
from tensorflow import Graph, Session

graph = Graph()
with graph.as_default():
    ann_sess = Session()
    with ann_sess.as_default():
        ml_model_sess = MLModelCatalog(dataset, "ann", "tensorflow")

        factuals_sess = predict_negative_instances(ml_model_sess, dataset.df)
        factuals_sess = factuals_sess.iloc[:n_test].reset_index(drop=True)

        cem = recourse_catalog.CEM(ann_sess, ml_model_sess, hyperparams)
        df_cfs = cem.get_counterfactuals(factuals_sess)

        benchmark = Benchmark(ml_model, cem, factuals_sess)
        distances = benchmark.compute_distances()
        distances.Distance_1[0]  # equal to 5
I get that the first sparsity/Distance_1 value is equal to 5. When printing out the factual and the counterfactual for this test observation, the two vectors are almost the same (the only difference is ‘capital-gain’).
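Roughly, the comparison I did looks like this (a sketch rather than my exact code; it reuses factuals_sess and df_cfs from the snippet above and the same get_ordered_features helper that the benchmark code uses):

import pandas as pd

# Put the first factual and its counterfactual side by side,
# using the same feature ordering as the benchmark code.
f = ml_model.get_ordered_features(factuals_sess).iloc[0]
cf = ml_model.get_ordered_features(df_cfs).iloc[0]
print(pd.DataFrame({"factual": f, "counterfactual": cf, "delta": f - cf}))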
The reason for this problem is that the Distance_1 code looks something like this:
import numpy as np

arr_f = ml_model.get_ordered_features(benchmark._factuals).to_numpy()
arr_cf = ml_model.get_ordered_features(benchmark._counterfactuals).to_numpy()
delta = arr_f - arr_cf

d1 = np.sum(delta != 0, axis=1, dtype=np.float).tolist()
For the first observation, delta (the difference between the factual and the counterfactual) contains entries that are very small but not exactly zero, which leads to a wrong calculation of d1.
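As a toy illustration of what goes wrong (the magnitudes below are made up; the real deltas come from the benchmark, but the effect is the same), every non-zero floating-point leftover gets counted as a changed feature, whereas a tolerance-based comparison such as np.isclose would not count it:

import numpy as np

# One genuinely changed feature plus a few floating-point leftovers.
delta = np.array([[0.0, 3e-17, -1e-16, 0.25, 5e-18]])

print(np.sum(delta != 0, axis=1))                        # [4] - tiny leftovers are counted
print(np.sum(~np.isclose(delta, 0, atol=1e-5), axis=1))  # [1] - only the real change

That said, I am not sure what tolerance would be appropriate here, especially for the one-hot encoded columns.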
Any suggestions on how to fix this delta/rounding problem?
Thanks!
Top GitHub Comments
Yeah that sounds good.
That’s no problem at all!