_partial_dependence_brute uses unnecessary memory
Describe the bug
When users run _partial_dependence_brute from inspection/_partial_dependence.py, the following code is executed for every grid point (usually 100 times), at lines 149-150:
for new_values in grid:
X_eval = X.copy()
Since each iteration overwrites a single feature at a time, it would be more efficient to move the X.copy() before the loop. Unless the model itself mutates the dataset, I don't think there's a reason to make a fresh copy on every iteration. For large datasets, this causes extreme memory use.
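The proposed change can be sketched as follows. This is a minimal sketch, not scikit-learn's actual implementation: `predict`, `feature_idx`, and `grid` are placeholder names, and the real code lives in sklearn/inspection/_partial_dependence.py.

```python
import numpy as np

def partial_dependence_brute_sketch(predict, X, feature_idx, grid):
    # Copy once, before the loop: every iteration fully overwrites the
    # same target column, so no stale values can leak between grid points.
    X_eval = X.copy()
    averaged = []
    for new_value in grid:
        X_eval[:, feature_idx] = new_value  # set the feature to the grid value
        averaged.append(predict(X_eval).mean())
    return np.asarray(averaged)
```

With a toy `predict` that just returns the targeted column, the output equals the grid values and the original `X` is left untouched, which is the behavior the per-iteration copy was guarding.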
Steps/Code to Reproduce
N/A
Expected Results
Same results as currently produced, but with only one copy of X made before the loop.
Actual Results
Correct results, but with unnecessary memory use from copying X on every grid point.
Versions
Since 0.23, or whenever partial_dependence was introduced.
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Oops, my bad. I went a bit more in-depth into the code and you are right. At this stage in the code we are handling a single interaction and we will always overwrite the same targeted feature(s); I thought that the features here corresponded to a loop over all possible interactions. We can safely put the copy outside of the loop, then. If garbage collection were working as expected we should not see any gain in memory usage; however, we avoid triggering copies, which could be a potential speed-up.
My only concern would be if the model is a pipeline which modifies the DataFrame in place during prediction.
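That concern can be demonstrated with a toy example. Here `mutating_predict` is a hypothetical misbehaving model, not a real scikit-learn estimator: if prediction mutates its input in place, a single hoisted copy lets the mutation leak between grid points.

```python
import numpy as np

def mutating_predict(X):
    # Hypothetical misbehaving model: scales its input in place during
    # prediction (imagine a pipeline step invoked with copy=False).
    X *= 2.0
    return X.sum(axis=1)

X = np.ones((3, 2))
X_eval = X.copy()                 # single copy hoisted out of the loop
results = []
for new_value in (0.0, 0.0):      # evaluate the same grid point twice
    X_eval[:, 0] = new_value      # overwrite the target feature
    results.append(mutating_predict(X_eval).mean())
# results[0] != results[1]: column 1 was doubled in place on the first
# call, and the hoisted copy lets that mutation carry into the second.
```

With a per-iteration copy the two evaluations would agree, so the only models affected by hoisting the copy are those that modify their input during predict.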