Exception when trying to parse on Windows 10: AttributeError: module 'os' has no attribute 'setsid'

I’m using tika 1.23 successfully with Python 3.7.4 on one Windows 10 machine. However, after installing tika 1.23.1 (the latest version) on another Windows 10 machine running Python 3.8.1, I get an exception when I try to parse files. For example, tika.parser.from_file("PATH_TO_MY_PDF_FILE.pdf") raises this exception: AttributeError: module 'os' has no attribute 'setsid'. (Note: I am initializing the VM before making this call.)

I dug into the tika source code and found the offending line in tika.py (line 666):

TikaServerProcess = Popen(cmd_string, stdout=logFile, stderr=STDOUT, shell=True, preexec_fn=os.setsid)

The offending line references os.setsid, but setsid does not exist in the os module on Windows per the docs (quoted below):

https://docs.python.org/3.8/library/os.html

os.setsid()

Call the system call setsid(). See the Unix manual for the semantics.

Availability: Unix.
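
As a quick check (a sketch, not part of the original report), the snippet below shows whether os.setsid exists on the current platform. The helper name setsid_status is hypothetical. Note that the AttributeError is raised while Python evaluates the argument expression os.setsid, before Popen is ever called.

```python
import os
import sys


def setsid_status() -> str:
    """Describe whether os.setsid is defined on this platform."""
    if hasattr(os, "setsid"):
        return f"{sys.platform}: os.setsid is available"
    try:
        # On platforms without setsid(), merely looking up the attribute fails.
        os.setsid
    except AttributeError as exc:
        return f"{sys.platform}: AttributeError: {exc}"
    return f"{sys.platform}: unexpected state"


print(setsid_status())
```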

I searched through the tika commit history on GitHub and found that this issue was introduced in this commit: https://github.com/chrismattmann/tika-python/blob/431f024d9f0862599421c27afec9076ecf29c2c3/tika/tika.py.

Prior to the aforementioned commit, the corresponding line (line 665) looked like this, with no reference to os.setsid:

cmd = Popen(cmd_string, stdout=logFile, stderr=STDOUT, shell=True)

Here’s the diff that shows where the issue was introduced: https://github.com/chrismattmann/tika-python/commit/431f024d9f0862599421c27afec9076ecf29c2c3#diff-79bb8c4ed90a3c7e927d1091e49a6680

This issue is preventing me from using the current version of tika on Windows. I’m going to have to downgrade to version 1.23 until this is fixed.
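
For illustration only, here is a minimal sketch of how a call like this could be guarded per platform. It is not tika's actual code or the maintainers' fix. The helper names platform_popen_kwargs and start_process are hypothetical, and the command is passed as an argument list without shell=True to keep the example portable.

```python
import os
import subprocess
import sys
import tempfile


def platform_popen_kwargs() -> dict:
    """Return extra Popen keyword arguments suited to the current platform."""
    if hasattr(os, "setsid"):
        # Unix: start the child in its own session, as the offending line intended.
        return {"preexec_fn": os.setsid}
    # Windows (assumption: a new process group is an acceptable substitute here).
    flags = getattr(subprocess, "CREATE_NEW_PROCESS_GROUP", 0)
    return {"creationflags": flags}


def start_process(cmd, log_file):
    """Start cmd with stdout and stderr redirected to log_file."""
    return subprocess.Popen(
        cmd,
        stdout=log_file,
        stderr=subprocess.STDOUT,
        **platform_popen_kwargs(),
    )


if __name__ == "__main__":
    # Demo: run a trivial child process and read back what it logged.
    with tempfile.TemporaryFile() as log:
        proc = start_process([sys.executable, "-c", "print('ok')"], log)
        proc.wait()
        log.seek(0)
        print(log.read().decode().strip())
```

Because os.setsid is only looked up inside the hasattr branch, the AttributeError cannot occur on Windows with this pattern.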

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 3
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

4 reactions
lgtateos commented, Feb 10, 2020

I installed tika through Anaconda today and I am getting the exception AttributeError: module 'os' has no attribute 'setsid'. I’m on Python 3.6 on Windows 10.

1 reaction
chrismattmann commented, Feb 4, 2020

We have a fix for this in #280; I’ll be applying it shortly. Thank you. I can push a 1.23.2 this week to release it.
