Error when attempting to `.add_data(gdf)` twice [Bug]
Describe the bug
In Jupyter Notebook, after initializing a map and adding a geopandas dataframe with .add_data, any further attempt to plot that same dataset raises an error unless the source data is reloaded first.
To Reproduce Steps to reproduce the behavior:
- Initialize map
- Load geopandas dataset
- Plot dataset
- Initialize new map
- Plot dataset
Here’s a code example that uses a random shapefile I had handy (it will work with any dataset loaded from gpd.read_file):
import geopandas as gpd
import keplergl
roads = gpd.read_file('./roads/tl_2016_us_primaryroads.shp').to_crs(epsg=4326).head(10)
m = keplergl.KeplerGl()
m.add_data(roads)
display(m)
m = keplergl.KeplerGl()
m.add_data(roads)
m
Expected behavior I should get a new map with the data plotted
Desktop (please complete the following information):
- OS: Arch Linux
- Browser: Firefox
- Version: 0.2.1
Additional context I think what’s going on is that keplergl is modifying the underlying dataset and then isn’t catching that it’s already been done the second time.
This example works, but requires copying the dataset
roads = gpd.read_file('./roads/tl_2016_us_primaryroads.shp').to_crs(epsg=4326).head(10)
m = keplergl.KeplerGl()
m.add_data(roads.copy())
display(m)
m = keplergl.KeplerGl()
m.add_data(roads.copy())
m
Here’s the traceback of the error that shows up:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-22-646e2e57d311> in <module>
4 display(m)
5 m = keplergl.KeplerGl()
----> 6 m.add_data(roads)
7 m
~/.local/lib/python3.8/site-packages/keplergl/keplergl.py in add_data(self, data, name)
130 '''
131
--> 132 normalized = _normalize_data(data)
133 copy = self.data.copy()
134 copy.update({name: normalized})
~/.local/lib/python3.8/site-packages/keplergl/keplergl.py in _normalize_data(data)
44 def _normalize_data(data):
45 if isinstance(data, pd.DataFrame):
---> 46 return _gdf_to_dict(data) if isinstance(data, geopandas.GeoDataFrame) else _df_to_dict(data)
47 return data
48
~/.local/lib/python3.8/site-packages/keplergl/keplergl.py in _gdf_to_dict(gdf)
38 df = pd.DataFrame(gdf)
39 # convert geometry to wkt
---> 40 df[name] = df.geometry.apply(lambda x: shapely.wkt.dumps(x))
41
42 return _df_to_dict(df)
~/.local/lib/python3.8/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
3846 else:
3847 values = self.astype(object).values
-> 3848 mapped = lib.map_infer(values, f, convert=convert_dtype)
3849
3850 if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
~/.local/lib/python3.8/site-packages/keplergl/keplergl.py in <lambda>(x)
38 df = pd.DataFrame(gdf)
39 # convert geometry to wkt
---> 40 df[name] = df.geometry.apply(lambda x: shapely.wkt.dumps(x))
41
42 return _df_to_dict(df)
~/.local/lib/python3.8/site-packages/shapely/wkt.py in dumps(ob, trim, **kw)
20 See available keyword output settings in ``shapely.geos.WKTWriter``.
21 """
---> 22 return geos.WKTWriter(geos.lgeos, trim=trim, **kw).write(ob)
23
24 def dump(ob, fp, **settings):
~/.local/lib/python3.8/site-packages/shapely/geos.py in write(self, geom)
391 def write(self, geom):
392 """Returns WKT string for geometry"""
--> 393 if geom is None or geom._geom is None:
394 raise ValueError("Null geometry supports no operations")
395 result = self._lgeos.GEOSWKTWriter_write(self._writer, geom._geom)
AttributeError: 'str' object has no attribute '_geom'
Top GitHub Comments
Is there any update on this? I’ve discovered (after some hair pulling) that this bug also impacts df.to_json and df.to_file(driver='GeoJSON',...), which subsequently error. The only solution is to pass a copy of the dataframe to Kepler instead of the original (sometimes infeasible for large datasets).
I think I’ve discovered the cause here. My first hint was noticing that calling the GeoDataFrame constructor on a pandas dataframe modifies the original dataframe by adding a geometry column (GeoPandas #1179). Apparently this behaviour also exists in pandas: if you call the DataFrame constructor on another dataframe, pandas simply creates a view, not a copy, so changes made to the new dataframe propagate to the old one.
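Knowing this, the problem appears to come from the lines of _gdf_to_dict shown in the traceback above: df = pd.DataFrame(gdf), followed by df[name] = df.geometry.apply(lambda x: shapely.wkt.dumps(x)).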
The code uses the DataFrame constructor to convert a GeoDataFrame to a DataFrame. It then modifies df, using shapely.wkt.dumps(x) to serialize the geometry. However, modifying df also changes gdf, since df is just a view of gdf.
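The sharing described here can be probed with plain pandas. The following is a minimal sketch, not code from keplergl: the function name constructor_shares_data is hypothetical, and WKT-like strings stand in for real geometries so that only pandas is required. Whether the assignment propagates depends on the installed pandas version (releases with Copy-on-Write enabled are expected not to propagate it), so the script reports the result rather than assuming it.

```python
import pandas as pd


def constructor_shares_data(frame: pd.DataFrame) -> bool:
    """Return True if a column assignment on pd.DataFrame(frame) changes `frame`.

    Hypothetical helper for illustration; it is not part of keplergl or pandas.
    """
    snapshot = frame.copy()        # deep copy, independent of `frame`
    wrapped = pd.DataFrame(frame)  # same pattern as `pd.DataFrame(gdf)` in the traceback
    # Stand-in for the WKT conversion: overwrite the column on the wrapper only.
    wrapped["geometry"] = wrapped["geometry"].astype(str).str.lower()
    return not frame.equals(snapshot)


frame = pd.DataFrame({"geometry": ["POINT (0 0)", "POINT (1 1)"]})
shared = constructor_shares_data(frame)
print("pandas", pd.__version__, "- constructor shares data:", shared)

# An explicit copy is independent of the original regardless of the result above.
frame2 = pd.DataFrame({"geometry": ["POINT (0 0)", "POINT (1 1)"]})
safe = frame2.copy()
safe["geometry"] = "changed"
print(frame2["geometry"].tolist())
```

If the first print reports True, the installed pandas behaves as described in this comment; either way, the second print shows that working on an explicit copy leaves the original values alone.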
Indeed, this can be verified easily. Prior to passing your geodataframe, gdf, to kepler, run gdf.dtypes to inspect the data types. You should see that the geometry column has a geometry dtype, as expected.
After you add the data to kepler, inspect gdf.dtypes again and you’ll see this:
The geometry column now has an object dtype! Interestingly, if you call gdf.geometry directly, it may still show up as a geometry dtype; I suspect this is GeoPandas enforcing a type on this column. However, if you now convert the gdf to a df using pd.DataFrame(gdf), as is done in the code, the geometry column on the new df will indeed be cast as an object type. shapely.wkt.dumps(x) operates on geometries, not strings, so the error 'str' object has no attribute '_geom' is thrown.
To avoid this problem, you could pass a copy of the data, as was already mentioned, or pass a GeoJSON object using gdf.to_json().
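The copy-based workaround can be wrapped in a small helper so it is not forgotten on later calls. This is only a sketch: add_data_safely is a hypothetical name, the default dataset name is an assumption of this example, and the only keplergl call relied on is add_data(data, name), as it appears in the traceback above.

```python
def add_data_safely(kepler_map, gdf, name="unnamed"):
    """Pass a copy to add_data so the caller's GeoDataFrame is left untouched.

    Hypothetical helper, not part of keplergl. `kepler_map` is expected to
    expose add_data(data, name); `gdf` is expected to expose copy().
    """
    kepler_map.add_data(data=gdf.copy(), name=name)
    return kepler_map


# Intended use in a notebook (assumes keplergl is installed and `roads`
# is a GeoDataFrame, as in the report above):
#   m = add_data_safely(keplergl.KeplerGl(), roads, name="roads")
#   m
```

The cost is one extra copy of the dataset per call, which, as noted earlier in the thread, may be impractical for very large datasets.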