CI failure on test_auto_normalize_collection_sync

The following test appears to be failing on CI.

_____________________ test_auto_normalize_collection_sync _____________________

c = <Client: 'tcp://127.0.0.1:58347' processes=2 threads=2, memory=15.03 GB>

    def test_auto_normalize_collection_sync(c):
        da = pytest.importorskip("dask.array")
        x = da.ones(10, chunks=5)
    
        y = x.map_blocks(slowinc, delay=1, dtype=x.dtype)
        yy = c.persist(y)
    
        wait(yy)
    
        with dask.config.set(optimizations=[c._optimize_insert_futures]):
            start = time()
            y.sum().compute()
            end = time()
>           assert end - start < 1
E           assert (1594691186.7301342 - 1594691185.6998608) < 1

distributed\tests\test_client.py:4354: AssertionError

Maybe we just need to bump the threshold a little bit. Though I’m not sure if that defeats the point of the test.


Top GitHub Comments

quasiben commented, Jul 14, 2020

It’s probably good to give an “incident” report here. What I think happened was that in PR https://github.com/dask/dask/pull/6382, collections_to_dsk called the optimizer on the graph during the grouping operation, where we want to group the same optimizations on a list of graphs and keys. Doing this prematurely resulted in an already optimized graph, which broke the test_auto_normalize_collection_sync test.

Normally, during this test we would see a list of keys like the following:

[('ones-c4a83f4b990021618d55e0fa61a351d6', 0),
 ('ones-c4a83f4b990021618d55e0fa61a351d6', 1),
 ('slowinc-93c2de6b40cbbc5e5761d94441f453c0', 0),
 ('slowinc-93c2de6b40cbbc5e5761d94441f453c0', 1),
 ('sum-9b460a9046234581b7a1aefca7bf50e3', 0),
 ('sum-9b460a9046234581b7a1aefca7bf50e3', 1),
 ('sum-aggregate-607cf5b2f5dafd05964459fea8a79ab8',)]

And we have a list of futures like the following:

{"('slowinc-93c2de6b40cbbc5e5761d94441f453c0', 0)": <FutureState: finished>, "('slowinc-93c2de6b40cbbc5e5761d94441f453c0', 1)": <FutureState: finished>}

The _optimize_insert_futures function iterates over the keys in the graph and replaces a value in the graph with a future only when its key overlaps with a key that already has a future result; essentially, don’t recompute things. However, because the graph had already been optimized, the list of keys looked like the following:

[('sum-aggregate-607cf5b2f5dafd05964459fea8a79ab8',),
 ('sum-9b460a9046234581b7a1aefca7bf50e3', 0),
 ('sum-9b460a9046234581b7a1aefca7bf50e3', 1),
 ('ones-sum-9b460a9046234581b7a1aefca7bf50e3', 1),
 ('ones-sum-9b460a9046234581b7a1aefca7bf50e3', 0)]

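With no slowinc keys left in the graph to match against the futures, the slowinc blocks were recomputed and the test broke. PR https://github.com/dask/dask/pull/6409 fixed the issue by removing the premature optimization call.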
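
To make the key-overlap point concrete, here is a minimal sketch using plain dicts. It is not the real _optimize_insert_futures code: insert_futures, count_reused, and FinishedResult are hypothetical names made up for illustration, the key hashes are shortened, and tuple keys are used directly rather than the stringified keys shown in the futures listing above.

```python
# Hypothetical stand-in for a finished future; NOT distributed.Future.
class FinishedResult:
    def __init__(self, key):
        self.key = key


def insert_futures(graph, futures):
    """Return a copy of `graph` where any task whose key has a finished
    future is replaced by that future (assumed, simplified behavior)."""
    return {key: futures.get(key, task) for key, task in graph.items()}


def count_reused(graph, futures):
    """Count how many graph keys overlap with the available futures."""
    return sum(1 for key in graph if key in futures)


# Futures produced by persisting the slowinc blocks (hashes shortened).
futures = {
    ("slowinc-93c2", 0): FinishedResult(("slowinc-93c2", 0)),
    ("slowinc-93c2", 1): FinishedResult(("slowinc-93c2", 1)),
}

# Keys as they look when the graph has not been optimized yet.
unoptimized = {
    ("ones-c4a8", 0): "ones task",
    ("ones-c4a8", 1): "ones task",
    ("slowinc-93c2", 0): "slowinc task",
    ("slowinc-93c2", 1): "slowinc task",
    ("sum-9b46", 0): "sum task",
    ("sum-9b46", 1): "sum task",
    ("sum-aggregate-607c",): "aggregate task",
}

# Keys after premature optimization: the slowinc keys are gone, fused
# into the 'ones-sum-...' tasks, so nothing matches the futures.
pre_optimized = {
    ("sum-aggregate-607c",): "aggregate task",
    ("sum-9b46", 0): "sum task",
    ("sum-9b46", 1): "sum task",
    ("ones-sum-9b46", 1): "fused task",
    ("ones-sum-9b46", 0): "fused task",
}

reused_before = count_reused(unoptimized, futures)
reused_after = count_reused(pre_optimized, futures)
patched = insert_futures(unoptimized, futures)

print("futures reused on unoptimized graph:", reused_before)
print("futures reused on pre-optimized graph:", reused_after)
```

In the unoptimized graph both slowinc keys match a future, so the slow work can be skipped. In the pre-optimized graph there is no overlap at all, which is consistent with the compute taking about one second again.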

jakirkham commented, Jul 14, 2020

Thanks Ben! 😄