Unbounded memory usage in tensordot
What happened:
Following #6846, the memory usage of `tensordot` (and `dot`, which delegates to `tensordot` in Dask) is much higher than before, and grows as a function of array size.
What you expected to happen:
The memory usage should scale with the chunk size (as it did previously), not with the size of the array. This was previously achieved by avoiding `concatenate=True` in the call to `blockwise` from `tensordot`. As a general rule, `concatenate=True` should be avoided, since it causes these memory issues for very large inputs; the sketch below illustrates the difference.
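For illustration, here is a minimal, hedged sketch of the two patterns. It is not Dask's actual `tensordot` implementation; the shapes, chunk sizes, and the `block_matmul` helper are invented for the example. With `concatenate=True`, a single task receives the full concatenated inputs along the contracted index, whereas keeping that index in the output and finishing the contraction with a separate `sum()` keeps each task at chunk scale.

```python
import numpy as np
import dask.array as da

x = da.ones((8, 8), chunks=(4, 4))
y = da.ones((8, 8), chunks=(4, 4))

# Problematic pattern: 'k' is contracted away inside blockwise, so every
# chunk along 'k' is concatenated into one task's input, and per-task
# memory grows with the array size along that axis.
z_concat = da.blockwise(
    lambda a, b: np.tensordot(a, b, axes=1),
    'ij', x, 'ik', y, 'kj',
    concatenate=True, dtype=x.dtype,
)

def block_matmul(a, b):
    # Multiply one pair of chunks, keeping a singleton 'k' axis so the
    # contraction can be finished by a separate reduction.
    return np.tensordot(a, b, axes=1)[:, :, None]

# Memory-friendly pattern: 'k' stays in the output index, each task sees
# exactly one block of x and one block of y, and the final sum() is a
# tree reduction whose memory is bounded by the chunk size.
intermediate = da.blockwise(
    block_matmul, 'ijk', x, 'ik', y, 'kj',
    dtype=x.dtype, adjust_chunks={'k': 1},
)
z_sum = intermediate.sum(axis=-1)

assert np.allclose(z_concat.compute(), z_sum.compute())
```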
Minimal Complete Verifiable Example:
The “Multiplication Only” part of this notebook, with `X.T @ Y` replaced by `da.dot(X.T, Y)`, demonstrates the problem. The memory usage should be flat, not growing with the size of the array. A sketch of this kind of reproduction follows.
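Since the notebook itself is not reproduced here, the following is a hedged sketch of that kind of benchmark; the shapes and chunk sizes are illustrative assumptions, not values taken from the notebook.

```python
import dask.array as da

# Tall-skinny inputs; scale n up while watching peak memory to see
# whether it stays flat (chunk-sized) or grows with the array size.
n = 10_000_000
X = da.random.random((n, 100), chunks=(1_000_000, 100))
Y = da.random.random((n, 100), chunks=(1_000_000, 100))

# X.T @ Y contracts the long axis; with concatenate=True every chunk
# along that axis feeds a single task.
result = da.dot(X.T, Y)
print(result.compute())
```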
Anything else we need to know?:
#6874 is a related issue that aims to remove the same memory issue from `matmul`.
Environment:
- Dask version: latest head (unreleased)
- Python version: 3.7.6
- Operating System: macOS
- Install method (conda, pip, source): source
I think this can be closed now.
From https://github.com/dask/dask/pull/6846#issuecomment-735763407:
As far as I understand, https://github.com/dask/dask/pull/6924 also adds a special case, but just for {1,2}d. Should I give it a try and produce a PR for the sparse array special case, to compare? What do you think, @tomwhite and the rest?