_partial_dependence_brute uses unnecessary memory

Describe the bug

When _partial_dependence_brute from inspection/_partial_dependence.py runs, the following code is executed once for every grid point (usually 100 times), at lines 149-150:

 for new_values in grid:
     X_eval = X.copy()

Since every iteration overwrites the same feature column(s), it would be more efficient to move the X.copy() call before the loop. Unless the model itself mutates the dataset, I don't think there's a reason to make a new copy on every iteration.

For large datasets, this causes extreme memory use.
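The proposed change can be sketched as follows. This is a simplified illustration, not the scikit-learn source: it assumes a NumPy array, a single target feature, and a predict callable that does not modify its input. The function names and `linear_predict` are hypothetical.

```python
import numpy as np


def pd_brute_copy_each_iter(predict, X, feature, grid):
    """Pattern from the issue: a fresh copy of X for every grid value."""
    averaged = []
    for value in grid:
        X_eval = X.copy()            # one full copy per grid point
        X_eval[:, feature] = value   # overwrite the target column
        averaged.append(predict(X_eval).mean())
    return np.array(averaged)


def pd_brute_copy_once(predict, X, feature, grid):
    """Proposed pattern: copy once, then overwrite the same column."""
    X_eval = X.copy()                # single copy, reused in every iteration
    averaged = []
    for value in grid:
        X_eval[:, feature] = value   # the previous value is fully replaced
        averaged.append(predict(X_eval).mean())
    return np.array(averaged)


def linear_predict(X):
    # Hypothetical stand-in for a fitted model; it only reads X.
    return X @ np.array([1.0, 2.0, 3.0])


rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
X_before = X.copy()
grid = np.linspace(-1.0, 1.0, 5)

per_iter = pd_brute_copy_each_iter(linear_predict, X, 1, grid)
hoisted = pd_brute_copy_once(linear_predict, X, 1, grid)
```

Both variants return the same averaged predictions and leave the caller's X untouched. The second one allocates a single copy instead of one per grid point.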

Steps/Code to Reproduce

N/A

Expected Results

The same results as now, but with less memory use.

Actual Results

The results are correct, but memory use is higher than necessary.

Versions

Since 0.23, or whenever partial_dependence was introduced.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

2 reactions
glemaitre commented, Dec 9, 2021

Oops, my bad. I went a bit more in-depth into the code and you are right. At this stage in the code, we are handling a single interaction and we will always overwrite the same targeted feature(s). I thought that the loop over features here corresponded to a loop over all the possible interactions.

We can safely put the copy outside of the loop then. If garbage collection is working as expected, we should not see any gain in memory usage. However, we avoid triggering repeated copies, which could be a potential speed-up.

0 reactions
jimbudarz commented, Dec 10, 2021

My only concern would be if the model is a pipeline that modifies the dataframe in place during prediction.
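A toy example may make this concern concrete. It is only a sketch under stated assumptions: `mutating_predict` is a hypothetical, deliberately badly behaved model that rescales its input in place, and the two helper functions are simplified stand-ins for the loop discussed above, not the scikit-learn code.

```python
import numpy as np


def mutating_predict(X):
    # Hypothetical model that modifies its input in place before predicting.
    X *= 2.0
    return X.sum(axis=1)


def pd_copy_each_iter(predict, X, feature, grid):
    results = []
    for value in grid:
        X_eval = X.copy()            # fresh copy: earlier mutations are discarded
        X_eval[:, feature] = value
        results.append(float(predict(X_eval).mean()))
    return results


def pd_copy_once(predict, X, feature, grid):
    X_eval = X.copy()                # shared copy: mutations accumulate
    results = []
    for value in grid:
        X_eval[:, feature] = value   # resets only the target column
        results.append(float(predict(X_eval).mean()))
    return results


X = np.ones((4, 3))
grid = [0.0, 1.0]

safe = pd_copy_each_iter(mutating_predict, X, 0, grid)
drifted = pd_copy_once(mutating_predict, X, 0, grid)
```

With a per-iteration copy the result is [4.0, 6.0]. With a single shared copy the non-target columns are doubled again on the second pass, giving [4.0, 10.0]. In both cases the caller's X is unchanged, so hoisting the copy is only safe when prediction leaves its input unmodified.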
