[clip vs. intersect] polygons lost due to dropping duplicates generated by .overlay()

I was testing the .overlay() function and ran into some unexpected behavior. I imported two GeoJSON files as follows:

The first one has a shape of (2167, 17), with an OBJECTID column: [plot1]

The second one has a shape of (5, 5): [plot2]

I ran .overlay(how='intersection') and got the following map, with a shape of (3195, 21): [plot3]

The above map looks right, but there are many duplicated rows in the GeoDataFrame. I used the method posted by folks here and converted the geometry column to WKB. I then ran .nunique() and saw that there are 2165 unique values in OBJECTID, but 3195 unique values in the geometry column. I’m confused about why the number is not 2165 and why it exceeds the row counts of both GeoDataFrames I imported.

Then I dropped the duplicated values based on the OBJECTID column with .drop_duplicates(subset='OBJECTID', inplace=True). I plotted the GeoDataFrame again and got the following map, and realized that some of the polygons had been lost: [plot4]

(By the way, I used both GeoJSON files in QGIS before and got a map identical to the one produced by .overlay(how='intersection'), and the clipped layer had 2165 rows in its attribute table, so 2165 seems to be the right number to expect.)

I checked the GeoPandas API and didn’t see any way to prevent those duplicated rows from being generated. I don’t know why there are 3195 rows in the GeoDataFrame, or why those polygons are lost when dropping duplicates.
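One plausible reading of the numbers above is that an input polygon crossing the boundary between two polygons of the second layer is split into one piece per polygon it touches, so rows share an OBJECTID without sharing a geometry. The sketch below reproduces that effect on made-up data; the layer names `parcels` and `zones`, the `zone` column, and all coordinates are assumptions for illustration, not the reporter's data. It also contrasts .drop_duplicates() with .dissolve(by='OBJECTID'), which merges the pieces instead of discarding them.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy stand-ins for the two GeoJSON layers (all values are made up).
parcels = gpd.GeoDataFrame(
    {"OBJECTID": [1, 2]},
    geometry=[box(0, 0, 2, 2), box(3, 0, 4, 1)],
)
# Two adjacent zones sharing the edge x = 1; parcel 1 straddles that edge.
zones = gpd.GeoDataFrame(
    {"zone": ["a", "b"]},
    geometry=[box(0, 0, 1, 3), box(1, 0, 5, 3)],
)

# Intersection yields one row per (parcel, zone) pair that overlaps.
pieces = gpd.overlay(parcels, zones, how="intersection")
n_rows = len(pieces)
n_ids = pieces["OBJECTID"].nunique()

# Keeping only the first row per OBJECTID throws away the other pieces.
deduped = pieces.drop_duplicates(subset="OBJECTID")

# Dissolving merges all pieces of each OBJECTID back into one geometry.
merged = pieces.dissolve(by="OBJECTID")

print("rows:", n_rows, "unique OBJECTID:", n_ids)
print("area after overlay:        ", pieces.area.sum())
print("area after drop_duplicates:", deduped.area.sum())
print("area after dissolve:       ", merged.area.sum())
```

In this toy case the rows that share an OBJECTID are distinct pieces, not true duplicates, which would explain why dropping them removes part of the map.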

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 23 (6 by maintainers)

Top GitHub Comments

1 reaction
StevenLi-DS commented, Jun 29, 2019

@austinorr I believe you brought this issue to me before. What do you think of https://github.com/geopandas/geopandas/issues/1027 ?

1 reaction
StevenLi-DS commented, Sep 25, 2018

@austinorr thank you for mentioning that, man. I was assuming that both were the same.
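The difference between clip and intersect that this comment alludes to can be sketched as follows, assuming a GeoPandas version that provides geopandas.clip. As documented, clip dissolves the mask into a single geometry before cutting, so each input feature yields at most one output row, while an intersection overlay keeps the mask's internal boundaries and attributes. The toy layers and the `zone` column are hypothetical.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical layers: two parcels, and a mask made of two adjacent zones.
parcels = gpd.GeoDataFrame(
    {"OBJECTID": [1, 2]},
    geometry=[box(0, 0, 2, 2), box(3, 0, 4, 1)],
)
zones = gpd.GeoDataFrame(
    {"zone": ["a", "b"]},
    geometry=[box(0, 0, 1, 3), box(1, 0, 5, 3)],
)

# clip: the mask is treated as one combined shape, parcels are not split
# along the zone boundary, and no mask attributes are carried over.
clipped = gpd.clip(parcels, zones)

# overlay intersection: parcel 1 is split at the zone boundary and each
# piece carries the attributes of the zone it falls in.
intersected = gpd.overlay(parcels, zones, how="intersection")

print("clip rows:     ", len(clipped), list(clipped.columns))
print("intersect rows:", len(intersected), list(intersected.columns))
```

If one row per original feature is the goal, as in the QGIS clip result described in the issue, clip (or an overlay followed by a dissolve on OBJECTID) is likely the closer match.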

Top Results From Across the Web

GeoPandas Intersection cause duplicate columns
What I try to do is loop through large polygons and find corresponding small polygons. The tool to be used is GeoPandas Overlay...

How prevent Intersect from duplicating records?
Points lying in the portion of overlap will be duplicated in the output because they intersect with 2 (or more) polygon features.

How Intersect works—ArcGIS Pro | Documentation
The Intersect tool calculates the geometric intersection of any number of feature classes and feature layers. The features, or portion of features, ...

Spatial Joins — GeoPandas 0.12.2+0.gefcb367.dirty ...
We duplicate them if necessary to represent multiple hits between the two dataframes. We retain attributes of the right and left only if...

terra.pdf
get or remove the polygon holes. makeNodes create nodes on lines. mergeLines connect lines to form polygons. removeDupNodes remove duplicate ...
