Exception when trying to parse on Windows 10: AttributeError: module 'os' has no attribute 'setsid'
I’m using tika 1.23 successfully with Python 3.7.4 on one Windows 10 machine. However, after installing tika 1.23.1 (the latest version) on another Windows 10 machine running Python 3.8.1, I get an exception whenever I try to parse a file. For example, tika.parser.from_file("PATH_TO_MY_PDF_FILE.pdf") results in this exception: AttributeError: module 'os' has no attribute 'setsid'.
(NOTE: I am initializing the VM before making this call).
I dug into the tika source code, and found the offending line of code in tika.py:
666: TikaServerProcess = Popen(cmd_string, stdout=logFile, stderr=STDOUT, shell=True, preexec_fn=os.setsid)
The offending line references os.setsid, but setsid does not exist in the os module on Windows, per the docs (quoted below):
https://docs.python.org/3.8/library/os.html
os.setsid()
Call the system call setsid(). See the Unix manual for the semantics. Availability: Unix.
I searched through the tika commit history on GitHub and found that this issue was introduced in this commit: https://github.com/chrismattmann/tika-python/blob/431f024d9f0862599421c27afec9076ecf29c2c3/tika/tika.py.
Prior to the aforementioned commit, the line of code in question looked like this, with no reference to os.setsid:
665: cmd = Popen(cmd_string, stdout=logFile, stderr=STDOUT, shell=True)
Here’s the diff that shows where the issue was introduced: https://github.com/chrismattmann/tika-python/commit/431f024d9f0862599421c27afec9076ecf29c2c3#diff-79bb8c4ed90a3c7e927d1091e49a6680
This issue is preventing me from using the current version of tika on Windows. I’m going to have to downgrade to version 1.23 until this is fixed.
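For illustration, a guarded version of that call might look like the sketch below. It is an assumption on my part, not the project's actual fix: the helper names session_kwargs and start_server are hypothetical, and on platforms without os.setsid it simply falls back to the 1.23 behaviour of passing no preexec_fn.

```python
import os
from subprocess import Popen, STDOUT


def session_kwargs():
    """Return extra Popen keyword arguments for the current platform.

    Hypothetical helper, not part of tika-python.
    """
    if hasattr(os, "setsid"):
        # Unix: start the child in a new session, as the 1.23.1 line does.
        return {"preexec_fn": os.setsid}
    # Windows: os.setsid does not exist, so pass nothing extra,
    # which matches the 1.23 call that worked there.
    return {}


def start_server(cmd_string, log_file):
    """Launch cmd_string through the shell, sending its output to log_file."""
    return Popen(cmd_string, stdout=log_file, stderr=STDOUT, shell=True,
                 **session_kwargs())
```

The hasattr check tests for the attribute itself instead of comparing platform names, so it follows the "Availability: Unix" note in the docs without hard-coding which systems count as Unix.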
Top GitHub Comments
I installed tika through Anaconda today and I am getting the AttributeError: module 'os' has no attribute 'setsid' exception. I’m on Python 3.6 on Windows 10.

We have a fix for this in #280, and I’ll be applying it shortly. Thank you. I can push a 1.23.2 this week to release it.