[clip vs. intersect] polygons lost due to dropping duplicates generated by .overlay()

I was testing the .overlay() function and ran into some unexpected behavior. I imported two GeoJSON files:

The first one has a shape of (2167, 17), with an OBJECTID column: (plot1)

The second one has a shape of (5, 5): (plot2)

I ran .overlay(how='intersection') and got the following map, with a shape of (3195, 21): (plot3)

The map looks right, but there are many duplicated rows in the GeoDataFrame. I used the method posted by folks here and converted the geometry column to WKB. I then ran .nunique() and saw that there are 2165 unique values in OBJECTID, but 3195 unique values in the geometry column. I'm confused about why the number is not 2165 and why it exceeds the row counts of both GeoDataFrames I imported.

Then I dropped the duplicated values based on the OBJECTID column with .drop_duplicates(subset='OBJECTID', inplace=True). I plotted the GeoDataFrame again and got the following map, and realized that some of the polygons had been lost: (plot4)

(By the way, I used both GeoJSON files in QGIS before and got a map identical to the one produced by .overlay(how='intersection'). The clipped layer had 2165 rows in its attribute table, so 2165 seems to be the right number to expect.)

I checked GeoPandas's API and didn't see any way to prevent those duplicated rows from being generated. I don't know why there are 3195 rows in the GeoDataFrame, or why those polygons are lost when dropping duplicates.
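For anyone hitting the same thing, here is a minimal sketch of what is probably going on, using made-up toy data (the `parcels` and `zones` frames and the `zone` column are hypothetical, not the files from the issue). It assumes the extra rows come from input polygons that straddle more than one polygon in the second layer: an intersection overlay emits one row per overlapping pair, so such a polygon is split into several pieces that share an OBJECTID but have different geometries. Dropping duplicates on OBJECTID then keeps only the first piece, while dissolving on OBJECTID (or clipping instead of intersecting) keeps the full shape.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy data: parcel 1 straddles both zones, parcel 2 sits in one.
parcels = gpd.GeoDataFrame(
    {"OBJECTID": [1, 2]},
    geometry=[box(0, 0, 4, 2), box(0, 3, 1, 4)],
)
zones = gpd.GeoDataFrame(
    {"zone": ["west", "east"]},
    geometry=[box(0, 0, 2, 5), box(2, 0, 5, 5)],
)

# Intersection yields one row per (parcel, zone) pair that overlaps,
# so parcel 1 appears twice with two different geometries.
pieces = gpd.overlay(parcels, zones, how="intersection")
print(len(pieces), pieces["OBJECTID"].nunique())

# Dropping duplicates on OBJECTID discards one of parcel 1's pieces.
deduped = pieces.drop_duplicates(subset="OBJECTID")
print(deduped.geometry.area.sum())

# Dissolving on OBJECTID merges the pieces back into one row per parcel.
# Non-geometry columns such as "zone" keep only the first value by default.
merged = pieces.dissolve(by="OBJECTID")
print(len(merged), merged.geometry.area.sum())

# Clipping treats the second layer as a single mask and keeps one row
# per input feature, which is closer to what a GIS "clip" tool does.
clipped = gpd.clip(parcels, zones)
print(len(clipped), clipped.geometry.area.sum())
```

If the attributes from the second layer matter, keeping the split rows is likely the right result. If only the outline of the first layer matters, `dissolve` or `clip` avoids both the extra rows and the lost area.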

Issue Analytics

State:
Created 5 years ago
Comments: 23 (6 by maintainers)

Top GitHub Comments

StevenLi-DS commented, Jun 29, 2019

@austinorr I believe you brought this issue to me before. What do you think of https://github.com/geopandas/geopandas/issues/1027 ?

StevenLi-DS commented, Sep 25, 2018

@austinorr Thank you for mentioning that, man. I had assumed that both were the same.
