Unbounded memory usage in tensordot

What happened:

Following #6846, the memory usage of tensordot (and dot, which delegates to tensordot in Dask) is much higher than before and grows with the size of the array.

What you expected to happen:

The memory usage should be related to chunk size (as it was previously), not the size of the array. This was achieved previously by avoiding concatenate=True in the call to blockwise from tensordot. As a general rule, concatenate=True should be avoided since it causes these memory issues for very large inputs.
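
To make the chunk-size argument concrete, here is a minimal NumPy sketch, not Dask's actual implementation. The function names `chunked_xty` and `concatenated_xty` and all sizes are hypothetical. It contrasts summing per-chunk partial products, where each step holds only one pair of chunks, with concatenating every chunk along the contracted axis first, where the operands grow with the full array.

```python
import numpy as np


def chunked_xty(x_chunks, y_chunks):
    """Sum per-chunk partial products; only one chunk pair is needed per step."""
    total = None
    for xc, yc in zip(x_chunks, y_chunks):
        partial = xc.T @ yc  # shape (p, q), independent of the number of chunks
        total = partial if total is None else total + partial
    return total


def concatenated_xty(x_chunks, y_chunks):
    """Concatenate along the contracted axis first; operands scale with array size."""
    x = np.concatenate(x_chunks, axis=0)
    y = np.concatenate(y_chunks, axis=0)
    return x.T @ y


# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
n_chunks, rows_per_chunk, p, q = 8, 100, 4, 3
x_chunks = [rng.random((rows_per_chunk, p)) for _ in range(n_chunks)]
y_chunks = [rng.random((rows_per_chunk, q)) for _ in range(n_chunks)]

result_chunked = chunked_xty(x_chunks, y_chunks)
result_concat = concatenated_xty(x_chunks, y_chunks)
```

Both functions give the same product up to floating-point rounding. The difference is in the largest operand each one has to hold: a single chunk pair in the first case, the whole contracted axis in the second.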

Minimal Complete Verifiable Example:

The “Multiplication Only” part of this notebook - with X.T @ Y replaced with da.dot(X.T, Y) - demonstrates the problem. The memory usage should be flat, not growing with the size of the array.
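
The notebook itself is not reproduced here. As a rough sketch of the operation it exercises, the snippet below builds two row-chunked Dask arrays and calls `da.dot(X.T, Y)`. The shapes and chunk sizes are assumptions made for illustration, not the notebook's values.

```python
import dask.array as da

# Hypothetical sizes; the notebook's actual sizes are not shown in this issue.
n_rows, p, q, rows_per_chunk = 10_000, 20, 10, 1_000

# Tall arrays chunked along the row axis, which is the axis contracted below.
X = da.random.random((n_rows, p), chunks=(rows_per_chunk, p))
Y = da.random.random((n_rows, q), chunks=(rows_per_chunk, q))

# da.dot delegates to tensordot; the contraction runs over the chunked axis.
XtY = da.dot(X.T, Y)
result = XtY.compute()
```

To check the behaviour described above, one could increase `n_rows` while holding `rows_per_chunk` fixed and watch peak memory with a profiler of choice.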

Anything else we need to know?:

#6874 is a related issue that aims to remove the same memory issue from matmul.

Environment:

  • Dask version: latest head (unreleased)
  • Python version: 3.7.6
  • Operating System: macOS
  • Install method (conda, pip, source): source

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

1 reaction
dcherian commented, Mar 24, 2022

I think this can be closed now.

1 reaction
ravwojdyla commented, Dec 3, 2020

From https://github.com/dask/dask/pull/6846#issuecomment-735763407:

We could have a special case for sparse array (based on type) that introduces the contraction in _tensordot only for sparse arrays? But I wonder if there is a more generic solution 🤔

As far as I understand, https://github.com/dask/dask/pull/6924 also adds a special case, but only for 1-D and 2-D arrays. Should I try producing a PR for the sparse array special case, for comparison? What do you think, @tomwhite and the rest?

