BUG: `difference` hangs/crashes in some case

Hello,

@LoicGoulefert and I have been tracking a weird hang with pygeos.set_operations.difference which I could reduce to the following test case:

Setup

macOS 10.15.6, Python 3.8.5 from MacPorts, in a clean venv with:

$ pip freeze
numpy==1.19.2
pygeos==0.8
$ port installed | grep geos
  geos @3.8.1_0 (active)

Code:

import pickle

import numpy as np
import pygeos

with open("lines.pickle", "rb") as fp:
    lines = pickle.load(fp)

with open("coords.npy", "rb") as fp:
    coords = np.load(fp)

line_arr = np.array([pygeos.linestrings(l) for l in lines])
p = pygeos.polygons(coords)

print(f"All lines valid: {np.all(pygeos.is_valid(line_arr))}")
print(f"Polygon valid: {pygeos.is_valid(p)}")
print("Starting...")
pygeos.set_operations.difference(line_arr, p)
print("Done...")

Here is a zip containing this script, the data files, and a script to display the datafile’s content: difference_bug.zip

Here is what the data looks like: image

In red, about 100 line strings loaded in line_arr. In blue, the polygon p.

Expected

Code runs quickly.

Actual

pygeos.set_operations.difference() never returns (or takes a long, long time), with CPU running at 100%.

$ python bug_repro.py 
All lines valid: True
Polygon valid: True
Starting...
^C^C^C^C^C^C^CKilled: 9

(finally killed from the process manager)

Any idea on what is going on?

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments:11 (4 by maintainers)

Top GitHub Comments

dr-jts commented, Sep 21, 2020 (1 reaction)

Another note: it is often faster to test whether two geometries intersect before computing their difference (or intersection), since predicates are usually faster than overlay. This is particularly true in this case. When computing piecewise difference, the performance is 55 s without a preliminary intersection test, but only 124 ms with one!

This test is not included as part of the overlay algorithm for the following reasons:

  • it is relatively expensive to perform this check in all situations, so it is left up to the client to determine if it is advantageous to do so
  • a client might use spatial indexing methods to filter out non-intersecting geometries
  • using an intersects filter means that the overlay results might not fulfill the overlay contract of providing fully-noded results (since, when a filter is used, the original input value might be returned unchanged).

abey79 commented, Sep 22, 2020 (0 reactions)

That makes sense, and pygeos handles this beautifully:

idx = pygeos.intersects(line_arr, p)
line_arr[idx] = pygeos.difference(line_arr[idx], p)
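
For readers without the attached data files, the two-line pattern above can be tried on toy data. The following is only a sketch under stated assumptions: the helper name `filtered_difference` and the three sample lines are hypothetical, and the fallback import relies on Shapely 2.0 or later exposing the same vectorized functions that pygeos does.

```python
import numpy as np

try:
    import pygeos as geo  # the library used in this thread
except ImportError:
    import shapely as geo  # assumption: Shapely >= 2.0, same vectorized functions


def filtered_difference(geoms, other):
    """Subtract `other` from each geometry, skipping those that do not intersect it.

    Hypothetical helper, not part of pygeos.
    """
    result = np.array(geoms, dtype=object)  # copy, so the caller's array is untouched
    hits = geo.intersects(result, other)    # cheap predicate first
    # Run the expensive overlay only where the predicate says it is needed.
    result[hits] = geo.difference(result[hits], other)
    return result


# Made-up sample data: three horizontal segments and a unit square.
lines = geo.linestrings([
    [[-1.0, 0.5], [2.0, 0.5]],  # crosses the square
    [[5.0, 5.0], [6.0, 5.0]],   # far away from the square
    [[0.2, 0.2], [0.8, 0.2]],   # entirely inside the square
])
square = geo.box(0.0, 0.0, 1.0, 1.0)

clipped = filtered_difference(lines, square)
print(geo.length(clipped))
```

Geometries that fail the `intersects` test are returned exactly as they were passed in, which is the contract caveat raised in the last bullet of the comment above.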