Unknown GhostScript error after successful camelot installation

Hello, I am currently on an Amazon EC2 Linux machine and have installed camelot through Anaconda with conda install -c conda-forge camelot-py. The installation happened without any issues. I could see Ghostscript as part of the dependencies being installed through Anaconda.

Afterwards, I attempted to extract the table from the example foo.pdf. From the documentation, this should be a simple tables = camelot.read_pdf('foo.pdf'). However, immediately after running that command, I received the following long error.

---------------------------------------------------------------------------
GhostscriptError                          Traceback (most recent call last)
<ipython-input-8-6d588ec94ca5> in <module>
----> 1 tables = camelot.read_pdf('./PDFs/foo.pdf')

/usr/local/.../lib/python3.8/site-packages/camelot/io.py in read_pdf(filepath, pages, password, flavor, suppress_stdout, layout_kwargs, **kwargs)
    111         p = PDFHandler(filepath, pages=pages, password=password)
    112         kwargs = remove_extra(kwargs, flavor=flavor)
--> 113         tables = p.parse(
    114             flavor=flavor,
    115             suppress_stdout=suppress_stdout,

/usr/local/.../lib/python3.8/site-packages/camelot/handlers.py in parse(self, flavor, suppress_stdout, layout_kwargs, **kwargs)
    169             parser = Lattice(**kwargs) if flavor == "lattice" else Stream(**kwargs)
    170             for p in pages:
--> 171                 t = parser.extract_tables(
    172                     p, suppress_stdout=suppress_stdout, layout_kwargs=layout_kwargs
    173                 )

/usr/local/.../lib/python3.8/site-packages/camelot/parsers/lattice.py in extract_tables(self, filename, suppress_stdout, layout_kwargs)
    400             return []
    401 
--> 402         self._generate_image()
    403         self._generate_table_bbox()
    404 

/usr/local/.../lib/python3.8/site-packages/camelot/parsers/lattice.py in _generate_image(self)
    217         gs_call = gs_call.encode().split()
    218         null = open(os.devnull, "wb")
--> 219         with Ghostscript(*gs_call, stdout=null) as gs:
    220             pass
    221         null.close()

/usr/local/.../lib/python3.8/site-packages/camelot/ext/ghostscript/__init__.py in Ghostscript(*args, **kwargs)
     88     if __instance__ is None:
     89         __instance__ = gs.new_instance()
---> 90     return __Ghostscript(
     91         __instance__,
     92         args,

/usr/local/.../lib/python3.8/site-packages/camelot/ext/ghostscript/__init__.py in __init__(self, instance, args, stdin, stdout, stderr)
     37         if stdin or stdout or stderr:
     38             self.set_stdio(stdin, stdout, stderr)
---> 39         rc = gs.init_with_args(instance, args)
     40         self._initialized = True
     41         if rc == gs.e_Quit:

/usr/local/.../lib/python3.8/site-packages/camelot/ext/ghostscript/_gsprint.py in init_with_args(instance, argv)
    172     rc = libgs.gsapi_init_with_args(instance, len(argv), c_argv)
    173     if rc not in (0, e_Quit, e_Info):
--> 174         raise GhostscriptError(rc)
    175     return rc
    176 

GhostscriptError: -770376232

The number at the end appears to change every time I run the command, but it stays in the general area of -700 million. The error above is from a Jupyter Notebook; running the same command from a plain command line simply prints Segmentation Fault. Downgrading Python from 3.8 to 3.6 did not fix the issue.
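One way to observe the crash without losing the notebook kernel is to run the call in a separate interpreter and inspect how that process ended. The following is only a sketch: the helper names `describe_exit` and `run_isolated` are hypothetical, and it assumes a POSIX system, where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative return code is the number of the signal
        # that killed the child process.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_isolated(snippet):
    """Run a Python snippet in a fresh interpreter and report how it ended."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )
    return describe_exit(result.returncode), result.stderr


# Example call (requires camelot and the PDF from this report):
# status, stderr = run_isolated(
#     "import camelot; camelot.read_pdf('./PDFs/foo.pdf')"
# )
# print(status)
```

If the child reports that it was killed by SIGSEGV, the crash is happening in native code rather than in Python, which would match the Segmentation Fault seen on the command line.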

I tried to see if this was a GhostScript problem, but running

gs -sDEVICE=txtwrite -o extractedText.txt ./PDFs/foo.pdf

worked as intended and I could see the text document all nicely formatted. I am unsure as to what the problem could be at this point. Any help is appreciated.
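A working `gs` executable does not necessarily mean Python is loading the same Ghostscript. The traceback ends in `libgs.gsapi_init_with_args`, which suggests camelot calls the Ghostscript shared library directly rather than the command-line tool. The sketch below assumes that library is resolved through `ctypes`, which is a guess about camelot's internals, and the helper names are hypothetical. It compares what is on `PATH` with what `ctypes.util.find_library` resolves, and asks the library for its revision through Ghostscript's `gsapi_revision` call.

```python
import ctypes
import ctypes.util
import shutil


class GsRevision(ctypes.Structure):
    """Mirror of the gsapi_revision_t struct from the Ghostscript C API."""

    _fields_ = [
        ("product", ctypes.c_char_p),
        ("copyright", ctypes.c_char_p),
        ("revision", ctypes.c_long),
        ("revisiondate", ctypes.c_long),
    ]


def locate_ghostscript():
    """Report the gs executable on PATH and the shared library ctypes finds."""
    return {
        "cli": shutil.which("gs"),
        "library": ctypes.util.find_library("gs"),
    }


def library_revision(lib_name):
    """Return the revision number reported by the shared library, or None."""
    if lib_name is None:
        return None
    try:
        libgs = ctypes.CDLL(lib_name)
    except OSError:
        return None
    rev = GsRevision()
    # gsapi_revision returns 0 on success.
    rc = libgs.gsapi_revision(ctypes.byref(rev), ctypes.sizeof(rev))
    return rev.revision if rc == 0 else None


info = locate_ghostscript()
print(info)
print(library_revision(info["library"]))
```

If the library revision differs from what `gs --version` prints, or the library resolves to a system copy outside the conda environment, a mismatch between the two installations would be worth ruling out.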

Thanks!

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

1 reaction
vinayak-mehta commented, Jul 11, 2021

Meanwhile, can you try installing the latest version with pip install "camelot-py[base]==0.10.1" and then trying out the poppler image conversion backend? Here’s a snippet:

import camelot
tables = camelot.read_pdf("https://camelot-py.readthedocs.io/en/master/_static/pdf/foo.pdf", backend="poppler")
tables[0]
# <Table shape=(7, 7)>

More info in the docs here: https://camelot-py.readthedocs.io/en/master/user/advanced.html#use-alternate-image-conversion-backends

1 reaction
vinayak-mehta commented, Jun 14, 2021

Yep, only the lattice flavor uses Ghostscript. I’ll have to figure out a way to reproduce this issue.
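Since only the lattice flavor goes through Ghostscript, one possible stopgap is to request the stream flavor, which the traceback shows is selected by the `flavor` argument of `read_pdf`. This is a minimal sketch, not a fix: `read_tables_stream` is a hypothetical helper, and stream parsing works from whitespace rather than ruling lines, so results on a ruled table like the one in foo.pdf may differ from what lattice would return.

```python
def read_tables_stream(path, read_pdf=None):
    """Extract tables with the stream flavor, avoiding the Ghostscript step."""
    if read_pdf is None:
        # Imported lazily so the helper can be exercised without camelot.
        import camelot

        read_pdf = camelot.read_pdf
    return read_pdf(path, flavor="stream")


# Example call (requires camelot and the PDF from this report):
# tables = read_tables_stream("./PDFs/foo.pdf")
# print(tables)
```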


Top Results From Across the Web

'Please make sure that Ghostscript is installed' error when it is ...
I am able to successfully run lattice on a PDF on Windows inside a conda environment (Python 3.6) with Camelot installed through pip....

Python Camelot / Ghostscript "wrong architecture" error
I ran into a similar issue, and managed to sort of solve it. My problem is that I was using an x86 version...

Camelot-Python Ghostscript is not installed Issue Resolved
Ghostscript Installation Link: https://www.ghostscript.com/releases/gsdnld.html

camelot python;OSError: exception: access violation writing ...
I stopped getting this error after I re-installed Camelot from source: git clone https://www.github.com/camelot-dev/camelot cd camelot pip install ".[cv]" ...

How do I install GMT (6.1.1) on my Windows, using cygwin
First, I have installed the gsview on my laptop! The reason for the previous error was that I had downloaded the recent version...
