xarray.DataArray map_blocks failed to deserialize

What happened

My xarray map_blocks call failed in pickling with a deserialization error:

distributed.protocol.core - CRITICAL - Failed to deserialize
...
TypeError: __init__() missing 1 required positional argument: 'code'

What you expected to happen:

The code should work (and it works if you close the client and just use Dask, not distributed).

Not-Quite-Minimal but Complete and Verifiable Example

import xarray as xr
from cartopy import crs as ccrs
import numpy as np
from dask.distributed import Client

client = Client()

nx = 10000
ny = 10000

x = (np.linspace(121940., 574180., nx))
y = (np.linspace(4250700., 4659150., ny))[::-1]

crs_from = ccrs.epsg(26917)

da = xr.DataArray(
    data=np.ones((ny,nx)),
    dims=["y", "x"],
    coords=dict(
        x=(["x"], x),
        y=(["y"], y))).chunk({'x':5120, 'y':5120})

crs_to = ccrs.PlateCarree()  

def xy_to_lonlat(da):
    x, y = np.meshgrid(da.x, da.y)
    ll = crs_to.transform_points(crs_from, x, y)
    lon = ll[:,:,0]
    lat = ll[:,:,1]
    da = da.assign_coords(dict(
        lon=(["y", "x"], lon),
        lat=(["y", "x"], lat)))
    return da

da2 = da.map_blocks(xy_to_lonlat).compute()

Anything else we need to know?: May be related to https://github.com/dask/dask/issues/8355 and/or https://github.com/dask/distributed/issues/5495 ?

Environment:

  • Dask version: 2021.11.0
  • Python version: 3.8.10
  • Operating System: Linux
  • Install method (conda, pip, source): conda

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
rsignell-usgs commented, Dec 9, 2021

Brilliant! Gave me a fish and taught me a little more about fishing! Thanks @ian-r-rose and @jrbourbeau !

1 reaction
jrbourbeau commented, Dec 8, 2021

@rsignell-usgs since crs_from is only used inside the xy_to_lonlat function, you could move its definition inside xy_to_lonlat to help with pickling. I was able to run the following code snippet on my laptop successfully:

import xarray as xr
from cartopy import crs as ccrs
import numpy as np
from dask.distributed import Client

if __name__ == "__main__":

    client = Client()

    nx = 10000
    ny = 10000

    x = (np.linspace(121940., 574180., nx))
    y = (np.linspace(4250700., 4659150., ny))[::-1]

    da = xr.DataArray(
        data=np.ones((ny,nx)),
        dims=["y", "x"],
        coords=dict(
            x=(["x"], x),
            y=(["y"], y))).chunk({'x':5120, 'y':5120})

    crs_to = ccrs.PlateCarree()  

    def xy_to_lonlat(da):
        x, y = np.meshgrid(da.x, da.y)
        crs_from = ccrs.epsg(26917)
        ll = crs_to.transform_points(crs_from, x, y)
        lon = ll[:,:,0]
        lat = ll[:,:,1]
        da = da.assign_coords(dict(
            lon=(["y", "x"], lon),
            lat=(["y", "x"], lat)))
        return da

    da2 = da.map_blocks(xy_to_lonlat).compute()
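
Why moving the definition helps can be illustrated without cartopy or a cluster. An object built inside the function is created where the function runs, so it never has to be serialized. A module-level object that the function refers to may have to travel with the task. The sketch below is only an illustration of that idea: `FakeCRS` and `roundtrips` are hypothetical names, and `FakeCRS` is deliberately written so that it fails to unpickle. It is not cartopy's actual implementation, and the thread does not confirm the exact cause of the original error.

```python
import pickle


class FakeCRS:
    """Hypothetical stand-in for an object that does not survive pickling."""

    def __init__(self, code):
        self.code = code

    def __reduce__(self):
        # Deliberately broken: asks pickle to rebuild the object with no
        # arguments, although __init__ requires `code`.
        return (type(self), ())


def roundtrips(obj):
    """Return True if obj survives pickle.dumps followed by pickle.loads."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception as exc:
        print(f"{type(obj).__name__} failed to round-trip: {exc}")
        return False


# Shipping the object itself fails ...
object_ok = roundtrips(FakeCRS(26917))

# ... while shipping the plain EPSG code and rebuilding on arrival works.
code_ok = roundtrips(26917)
rebuilt = FakeCRS(pickle.loads(pickle.dumps(26917)))
print(object_ok, code_ok, rebuilt.code)
```

A quick `pickle.loads(pickle.dumps(obj))` check like this, run on each object a mapped function refers to, is one way to narrow down which object triggers a "Failed to deserialize" message before restructuring the code.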

