BUG: sqlite read error with ProcessPoolExecutor

This bug seems similar to (or exactly the same as?) #426.

Code Sample, a copy-pastable example if possible

I’ve created a Poetry environment and Python scripts that reproduce the bug on my system. Because the failure depends on parallel timing, it might not reproduce on every system.

git clone https://github.com/nialov/pyproj-multiprocessing-bug-hunt.git
cd pyproj-multiprocessing-bug-hunt
# Need poetry installed on system
poetry install
# Script with parallel processes and which tries to reproduce bug
poetry run python script_parallel.py
# Sanity check script with sequential processing which doesn't error.
poetry run python script_sequential.py

Problem description

pyproj 3.2.0 raises an error when its SQLite database file is read in parallel via Python's concurrent.futures.ProcessPoolExecutor. I assume any method of creating parallel processes in Python will reproduce this.

This bug occurs with pyproj 3.2.0 and is not present in pyproj 3.1.0.
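
The repository scripts are not reproduced in this report. As a rough sketch only, inferred from the traceback (where the worker calls the CRS constructor directly on EPSG:3067), a parallel reproduction might look like the following. The helper name build_in_parallel, its parameters, and the task count are hypothetical and are not taken from script_parallel.py.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def build_in_parallel(factory, arg, n_tasks=8, executor_cls=ProcessPoolExecutor):
    """Call factory(arg) n_tasks times in a pool and return the results.

    Hypothetical helper: an exception raised inside a worker is re-raised
    here by future.result(), which is how the CRSError surfaces in the parent.
    """
    with executor_cls() as executor:
        futures = [executor.submit(factory, arg) for _ in range(n_tasks)]
        return [future.result() for future in as_completed(futures)]


def reproduce():
    """Build the same CRS in several worker processes (requires pyproj)."""
    # Imported lazily so the helper above can be used without pyproj installed.
    from pyproj import CRS

    for crs in build_in_parallel(CRS, "EPSG:3067"):
        print(crs)
```

Call reproduce() from under an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform. Passing a different executor_cls, for example concurrent.futures.ThreadPoolExecutor, gives a quick comparison run that does not create new processes.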

Error message:

➜ pr python script_parallel.py
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/nialov/.cache/pypoetry/virtualenvs/pyproj-multiprocessing-bug-hunt-ovaqiMDF-py3.8/lib/python3.8/site-packages/pyproj/crs/crs.py", line 326, in __init__
    self._local.crs = _CRS(self.srs)
  File "pyproj/_crs.pyx", line 2347, in pyproj._crs._CRS.__init__
pyproj.exceptions.CRSError: Invalid projection: EPSG:3067: (Internal Proj Error: proj_create: SQLite error on SELECT auth_name FROM authority_list: database disk image is malformed)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "script_parallel.py", line 13, in <module>
    print(process.result())
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
pyproj.exceptions.CRSError: Invalid projection: EPSG:3067: (Internal Proj Error: proj_create: SQLite error on SELECT auth_name FROM authority_list: database disk image is malformed)

Expected Output

It should work in parallel. I added a sequential example script, script_sequential.py, as a sanity check.

Environment Information

➜ pr pyproj -v
pyproj info:
    pyproj: 3.2.0
      PROJ: 8.1.1
  data dir: /home/nialov/.cache/pypoetry/virtualenvs/pyproj-multiprocessing-bug-hunt-ovaqiMDF-py3.8/lib/python3.8/site-packages/pyproj/proj_dir/share/proj
user_data_dir: /home/nialov/.local/share/proj

System:
    python: 3.8.10 (default, Jun  2 2021, 10:49:15)  [GCC 10.3.0]
executable: /home/nialov/.cache/pypoetry/virtualenvs/pyproj-multiprocessing-bug-hunt-ovaqiMDF-py3.8/bin/python
   machine: Linux-4.19.84-microsoft-standard-x86_64-with-glibc2.32

Python deps:
   certifi: 2021.05.30
       pip: 21.1.3
setuptools: 57.4.0
    Cython: None

Installation method

Installed from PyPI onto Ubuntu 20.10.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (9 by maintainers)

Top GitHub Comments

snowman2 commented, Sep 12, 2021 (1 reaction)

Thanks @nialov for the report and @rouault for helping to debug & resolve the issue 👍

rouault commented, Sep 6, 2021 (1 reaction)

Another finding is that whether SQLite3 uses pread64() depends on which source distribution you use. If you use the sqlite-autoconf-XXXX builds, their configure script doesn’t include pread64() detection, so you have to explicitly pass CFLAGS="-DHAVE_PREAD64 -DHAVE_PWRITE64". The sqlite-src-XXXXX.zip distribution, by contrast, detects it automatically…
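
Why pread64() could matter is not spelled out in the excerpt above. One plausible reading, offered as an assumption rather than as the maintainers' diagnosis, is that file descriptors sharing one open file description (for example, a descriptor inherited across fork()) also share a single file offset. A seek followed by a plain read can then be disturbed by another holder of the descriptor, whereas a positional read carries its own offset. The sketch below illustrates only that POSIX behaviour, using os.dup() in a single process as a stand-in for an inherited descriptor. It does not touch SQLite or pyproj, and the function name demo_shared_offset is hypothetical.

```python
import os
import tempfile


def demo_shared_offset():
    """Contrast read() and pread() on descriptors that share a file offset."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as handle:
            handle.write(b"0123456789")

        fd = os.open(path, os.O_RDONLY)
        # dup() yields a second descriptor for the same open file description,
        # so both descriptors share one offset (as after fork()).
        other_fd = os.dup(fd)
        try:
            # The "other" holder moves the shared offset to byte 5.
            os.lseek(other_fd, 5, os.SEEK_SET)
            # A plain read on fd starts at the shared offset, not at byte 0.
            via_read = os.read(fd, 3)
            # A positional read uses its explicit offset and ignores the shared one.
            via_pread = os.pread(fd, 3, 0)
        finally:
            os.close(other_fd)
            os.close(fd)
    return via_read, via_pread


if __name__ == "__main__":
    print(demo_shared_offset())
```

os.pread is available on Unix-like systems only, which matches the Linux environment in the report.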
