Unknown GhostScript error after successful camelot installation
Hello,
I am currently on an Amazon EC2 Linux machine and have installed camelot through Anaconda with conda install -c conda-forge camelot-py. The installation completed without any issues, and I could see Ghostscript among the dependencies that Anaconda installed.
Afterwards, I attempted to extract the table from the example foo.pdf. According to the documentation, this should be a simple tables = camelot.read_pdf('foo.pdf'). However, immediately after running that command, I received the following long error.
---------------------------------------------------------------------------
GhostscriptError Traceback (most recent call last)
<ipython-input-8-6d588ec94ca5> in <module>
----> 1 tables = camelot.read_pdf('./PDFs/foo.pdf')
/usr/local/.../lib/python3.8/site-packages/camelot/io.py in read_pdf(filepath, pages, password, flavor, suppress_stdout, layout_kwargs, **kwargs)
111 p = PDFHandler(filepath, pages=pages, password=password)
112 kwargs = remove_extra(kwargs, flavor=flavor)
--> 113 tables = p.parse(
114 flavor=flavor,
115 suppress_stdout=suppress_stdout,
/usr/local/.../lib/python3.8/site-packages/camelot/handlers.py in parse(self, flavor, suppress_stdout, layout_kwargs, **kwargs)
169 parser = Lattice(**kwargs) if flavor == "lattice" else Stream(**kwargs)
170 for p in pages:
--> 171 t = parser.extract_tables(
172 p, suppress_stdout=suppress_stdout, layout_kwargs=layout_kwargs
173 )
/usr/local/.../lib/python3.8/site-packages/camelot/parsers/lattice.py in extract_tables(self, filename, suppress_stdout, layout_kwargs)
400 return []
401
--> 402 self._generate_image()
403 self._generate_table_bbox()
404
/usr/local/.../lib/python3.8/site-packages/camelot/parsers/lattice.py in _generate_image(self)
217 gs_call = gs_call.encode().split()
218 null = open(os.devnull, "wb")
--> 219 with Ghostscript(*gs_call, stdout=null) as gs:
220 pass
221 null.close()
/usr/local/.../lib/python3.8/site-packages/camelot/ext/ghostscript/__init__.py in Ghostscript(*args, **kwargs)
88 if __instance__ is None:
89 __instance__ = gs.new_instance()
---> 90 return __Ghostscript(
91 __instance__,
92 args,
/usr/local/.../lib/python3.8/site-packages/camelot/ext/ghostscript/__init__.py in __init__(self, instance, args, stdin, stdout, stderr)
37 if stdin or stdout or stderr:
38 self.set_stdio(stdin, stdout, stderr)
---> 39 rc = gs.init_with_args(instance, args)
40 self._initialized = True
41 if rc == gs.e_Quit:
/usr/local/.../lib/python3.8/site-packages/camelot/ext/ghostscript/_gsprint.py in init_with_args(instance, argv)
172 rc = libgs.gsapi_init_with_args(instance, len(argv), c_argv)
173 if rc not in (0, e_Quit, e_Info):
--> 174 raise GhostscriptError(rc)
175 return rc
176
GhostscriptError: -770376232
The number at the end appears to change every time I run the command, though it stays in the general area of -700 million. The traceback above is from a Jupyter Notebook; running the same code from the plain command line simply prints Segmentation Fault. Downgrading the Python version from 3.8 to 3.6 did not fix the issue.
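Because the failure is a hard crash inside the Ghostscript shared library, one way to inspect it without losing the notebook kernel is to run the call in a child interpreter with faulthandler enabled. The following is a minimal sketch, not part of camelot: the helper names build_probe_command, describe_returncode and probe are hypothetical, and it assumes a POSIX system where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def build_probe_command(pdf_path):
    """Command line for a child interpreter that runs the failing call.

    -X faulthandler makes CPython dump the Python stack on a fatal signal.
    """
    code = "import camelot; camelot.read_pdf({!r})".format(pdf_path)
    return [sys.executable, "-X", "faulthandler", "-c", code]


def describe_returncode(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status {}".format(returncode)


def probe(pdf_path):
    """Run the extraction in a child process and report how it ended."""
    result = subprocess.run(
        build_probe_command(pdf_path), capture_output=True, text=True
    )
    return describe_returncode(result.returncode), result.stderr
```

Calling probe('./PDFs/foo.pdf') returns the exit description together with whatever the child wrote to stderr, which should include the faulthandler stack dump if the process segfaults.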
I tried to see if this was a GhostScript problem, but running
gs -sDEVICE=txtwrite -o extractedText.txt ./PDFs/foo.pdf
worked as intended and I could see the text document all nicely formatted. I am unsure as to what the problem could be at this point. Any help is appreciated.
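The txtwrite check exercises the gs executable, but not the page rasterization that the lattice parser needs before it can detect table lines. A closer comparison would be to rasterize the page to PNG from the command line. The sketch below makes some assumptions: the png16m device and the 300 dpi resolution are guesses at what camelot requests (the traceback only shows that gs_call is built and split, not its contents), and the helper names are hypothetical.

```python
import shutil
import subprocess


def build_gs_command(pdf_path, png_path, resolution=300):
    """Build a Ghostscript CLI call that rasterizes a PDF to a PNG image."""
    return [
        "gs",
        "-q",                    # suppress startup messages
        "-sDEVICE=png16m",       # 24-bit colour PNG output device (assumed)
        "-o", png_path,          # output file
        "-r{}".format(resolution),
        pdf_path,
    ]


def render_with_gs_cli(pdf_path, png_path):
    """Run the command if gs is on PATH; return its exit code, else None."""
    if shutil.which("gs") is None:
        return None
    result = subprocess.run(
        build_gs_command(pdf_path, png_path), capture_output=True, text=True
    )
    return result.returncode
```

If render_with_gs_cli('./PDFs/foo.pdf', 'foo.png') returns 0 while the Python call still crashes, that would point at the shared library loaded by camelot rather than at the gs executable itself.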
Thanks!
Top GitHub Comments
Meanwhile, can you try installing the latest version with
pip install "camelot-py[base]==0.10.1"
and then trying out the poppler image conversion backend? More info in the docs here: https://camelot-py.readthedocs.io/en/master/user/advanced.html#use-alternate-image-conversion-backends
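A minimal sketch of what that call might look like, assuming the backend keyword described on the linked docs page and camelot-py 0.10.x. The wrapper name read_with_poppler and its read_pdf parameter are hypothetical; the parameter exists only so the call can be exercised without camelot installed.

```python
def read_with_poppler(pdf_path, read_pdf=None):
    """Read tables with the lattice flavor, rasterizing via poppler.

    read_pdf is a hypothetical injection point; by default the real
    camelot.read_pdf is used.
    """
    if read_pdf is None:
        import camelot  # assumes camelot-py 0.10.x is installed
        read_pdf = camelot.read_pdf
    # backend="poppler" swaps out Ghostscript for the image conversion step.
    return read_pdf(pdf_path, flavor="lattice", backend="poppler")
```

With camelot installed, tables = read_with_poppler('./PDFs/foo.pdf') is equivalent to calling camelot.read_pdf directly with backend="poppler".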
Yep only the lattice flavor uses ghostscript. I’ll have to figure out a way to reproduce this issue.
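Since Ghostscript is only involved in the lattice flavor, switching to the stream flavor is one way to avoid the crashing code path while the issue is investigated. This is a sketch rather than a fix: stream detects tables from the whitespace between cells instead of ruling lines, so the result for foo.pdf may differ. The wrapper name read_with_stream and its read_pdf parameter are hypothetical.

```python
def read_with_stream(pdf_path, read_pdf=None):
    """Read tables with the stream flavor, which does not render an image."""
    if read_pdf is None:
        import camelot  # assumes camelot-py is installed
        read_pdf = camelot.read_pdf
    return read_pdf(pdf_path, flavor="stream")
```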