Index type is lost when filtering to empty geo data frame
Description
When a GeoDataFrame is filtered with a condition that matches no rows, so that
the result is empty, the type of the index is lost (for example, a DatetimeIndex comes back as a RangeIndex).
Test
import pandas as pd
import geopandas as gpd
from datetime import timedelta, datetime
from shapely.geometry import Point
df = pd.DataFrame(
    dict(
        geometry=[Point(i, i) for i in range(3)],
        timestamp=[datetime.now() + timedelta(seconds=i) for i in range(3)],
    )
).set_index("timestamp")
gdf = gpd.GeoDataFrame(df)
max_date = df.index.max()
df_filtered = df.loc[df.index > max_date, :]
gdf_filtered = gdf.loc[gdf.index > max_date, :]
print(df_filtered.index)
print(gdf_filtered.index)
GeoPandas 0.5.1
Output
DatetimeIndex([], dtype='datetime64[ns]', name='timestamp', freq=None)
DatetimeIndex([], dtype='datetime64[ns]', name='timestamp', freq=None)
Environment
attrs==19.3.0
Click==7.0
click-plugins==1.1.1
cligj==0.5.0
Fiona==1.8.9.post2
geopandas==0.5.1
munch==2.5.0
numpy==1.17.3
pandas==0.25.3
pyproj==2.4.0
python-dateutil==2.8.1
pytz==2019.3
Shapely==1.6.4.post2
six==1.12.0
GeoPandas 0.6.1
Output
DatetimeIndex([], dtype='datetime64[ns]', name='timestamp', freq=None)
RangeIndex(start=0, stop=0, step=1)
Environment
attrs==19.3.0
Click==7.0
click-plugins==1.1.1
cligj==0.5.0
Fiona==1.8.9.post2
geopandas==0.6.1
munch==2.5.0
numpy==1.17.3
pandas==0.25.3
pyproj==2.4.0
python-dateutil==2.8.1
pytz==2019.3
Shapely==1.6.4.post2
six==1.12.0
Issue Analytics
- State:
- Created 4 years ago
- Comments: 5 (4 by maintainers)
Top GitHub Comments
@simon-keith thanks a lot for the clear, reproducible bug report!
I can reproduce this on 0.6.1, but not on master, so something may have fixed it already. In that case, we should add a test and could backport the fix to 0.6.x.
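A regression test for this could look roughly like the sketch below. It assumes GeoPandas, pandas and Shapely are installed. The function name `test_empty_filter_preserves_index_type` and the fixed dates are illustrative and are not taken from the GeoPandas test suite.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point


def test_empty_filter_preserves_index_type():
    # Hypothetical test name; fixed dates keep the test deterministic.
    index = pd.date_range("2019-01-01", periods=3, freq="D", name="timestamp")
    gdf = gpd.GeoDataFrame(
        {"geometry": [Point(i, i) for i in range(3)]}, index=index
    )

    # The mask is False everywhere, so the filtered frame is empty.
    filtered = gdf.loc[gdf.index > gdf.index.max(), :]

    assert len(filtered) == 0
    assert isinstance(filtered.index, pd.DatetimeIndex)
    assert filtered.index.name == "timestamp"
```

Run under pytest, this would be expected to fail on the affected GeoPandas 0.6.1 / pandas 0.25 combination and pass once the index type is preserved.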
OK, I found the change that explains why it fails with released pandas but works with pandas master.
Pandas 0.25:
With pandas master:
So assigning to a column of an empty frame resets the index on released pandas, but this has been fixed to preserve the index on pandas master.
And with GeoPandas 0.6, we started to rely on this pattern to write the geometries into the array (and thus also do this for empty geodataframes), hence introducing this regression for released pandas versions.
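The pattern described above can be checked against whichever pandas version is installed. This is a minimal sketch and not the original snippets from the comment: the frame, the column name `value` and the helper `index_after_assignment` are made up for illustration.

```python
import numpy as np
import pandas as pd


def index_after_assignment():
    # Hypothetical helper: build a small frame with a DatetimeIndex,
    # slice it down to zero rows, then assign to one of its columns.
    df = pd.DataFrame(
        {"value": [1.0, 2.0]},
        index=pd.date_range("2019-01-01", periods=2, freq="D", name="timestamp"),
    )
    empty = df.iloc[:0].copy()
    empty["value"] = np.array([], dtype="float64")
    return empty.index


idx = index_after_assignment()
# Report the pandas version next to the index type seen after the assignment.
print(pd.__version__, type(idx).__name__)
```

Going by the comment, pandas 0.25 would be expected to report a reset index here, while versions that include the fix should keep the `DatetimeIndex`.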