Improve user experience for Windows 10

See original GitHub issue

Hi

Describe the issue

I’ve managed to run OCRmyPDF.exe on Windows 10 without WSL.

To Reproduce

I’ve made a fork and added some quick fixes in this commit: https://github.com/dibu28/OCRmyPDF/commit/543088e79e8649e968d02d8fd268123255607dc1

Fixes are:

  1. In leptonica.py, the library name is liblept-5 instead of lept.
  2. In ghostscript.py:
     2.1) The executable name is gswin64c.exe instead of gs.
     2.2) NamedTemporaryFile doesn’t work properly: gs could not modify the temp file and failed with an access denied error. As a temporary workaround I’m appending “_1” to the temp file name and then removing the file. There could be a better way.
  3. In _pipeline.py and helpers.py, symlinking to the temp folder on Windows requires Admin privileges, so instead of symlinking I’m just copying the files.
  4. In _sync.py, os.path.samefile raises an error: OSError: [WinError 1] Incorrect function: 'nul'

So after those changes and installing the dependencies, it started to work from the command line like this: OCRmyPDF.exe input.pdf output.pdf

Dependencies and binaries I’m using:

  • https://www.python.org/ftp/python/3.7.5/python-3.7.5-amd64.exe
  • https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v5.0.0-alpha.20191030.exe
  • https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs950/gs950w64.exe
  • https://github.com/qpdf/qpdf/releases/download/release-qpdf-9.0.2/qpdf-9.0.2-bin-msvc64.zip

Add paths to the PATH variable:

set PATH=%PATH%;C:\Program Files\Tesseract-OCR;
set PATH=%PATH%;C:\Program Files\gs\gs9.50\bin;
set PATH=%PATH%;C:\qpdf\qpdf-9.0.2-bin-msvc64\qpdf-9.0.2\bin;

Then build and run:

python setup.py build
OCRmyPDF.exe input.pdf output.pdf

Expected behavior

Can we add some workarounds using conditions based on OS type?
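
As an illustration of what such OS-based conditions might look like for fixes 1 and 2.1, here is a small sketch. The function names are hypothetical and not part of OCRmyPDF. The only values taken from this issue are the library names (liblept-5 vs. lept) and the executable names (gswin64c.exe vs. gs).

```python
import os


def leptonica_library_name(os_name=os.name):
    """Pick the Leptonica library name by OS type (hypothetical helper)."""
    # os.name is "nt" on Windows and "posix" on Linux and macOS.
    return "liblept-5" if os_name == "nt" else "lept"


def ghostscript_executable(os_name=os.name):
    """Pick the Ghostscript console executable by OS type (hypothetical helper)."""
    return "gswin64c.exe" if os_name == "nt" else "gs"


if __name__ == "__main__":
    print(leptonica_library_name(), ghostscript_executable())
```

Taking the OS name as a parameter keeps the Windows branch testable on other platforms.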

System:

  • OS: Windows 10
  • OCRmyPDF Version: v9.0.5

Additional context

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 57

Top GitHub Comments

2 reactions
insanepopeye commented, Aug 20, 2020

import ocrmypdf

Traceback (most recent call last):

  File "<ipython-input-1-a81f3474d7ad>", line 1, in <module>
    import ocrmypdf

  File "C:\Users\22252\AppData\Roaming\Python\Python38\site-packages\ocrmypdf\__init__.py", line 10, in <module>
    from ocrmypdf import helpers, hocrtransform, leptonica, pdfa, pdfinfo

  File "C:\Users\22252\AppData\Roaming\Python\Python38\site-packages\ocrmypdf\leptonica.py", line 62, in <module>
    lept = ffi.dlopen(_libpath)

OSError: cannot load library 'D:\OCR\Tesseract-OCR\liblept-5.dll': error 0x7f

Please let me know how to fix this.

1 reaction
jbarlow83 commented, Dec 19, 2019

The first step will be for ocrmypdf to check in reasonable locations for Tesseract and GS, examining the registry or whatever, so PATH becomes an override.
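
A rough sketch of that lookup order, with PATH taking precedence over well-known install locations, might look like the following. `find_program` is a hypothetical name, the fallback directories are simply the install paths mentioned earlier in this issue, and the registry lookup is left out entirely.

```python
import shutil


def find_program(name, fallback_dirs=()):
    """Locate an executable: PATH first (the override), then fallback directories."""
    found = shutil.which(name)
    if found:
        return found
    for directory in fallback_dirs:
        # shutil.which accepts an explicit search path via its path argument.
        found = shutil.which(name, path=str(directory))
        if found:
            return found
    return None


if __name__ == "__main__":
    # Assumed default install locations, taken from the PATH entries above.
    gs = find_program("gswin64c", [r"C:\Program Files\gs\gs9.50\bin"])
    tesseract = find_program("tesseract", [r"C:\Program Files\Tesseract-OCR"])
    print(gs, tesseract)
```

A versioned directory such as gs9.50 would go stale quickly, so a real implementation would probably need to glob for installed versions or read the registry instead of hard-coding the path.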

I don’t believe I can bundle the GS installer unless I change OCRmyPDF to AGPL, and I’m not sure I want to do that. I believe everything else could be bundled.

As far as actually doing a Windows installer, bundling, or setting up a choco package, I am hoping the community will step up, because I haven’t made a Windows installer before or tried to package a Python application for Windows, and other people probably know how to get this off the ground faster than I can, even if I end up finishing it. I converted to Azure Pipelines for its better Windows support, so that ideally we can test and deploy for every distribution type in one shot.

ocrmypdf is a unique/more complex case in its use of Leptonica (ABI-level binding to a C library), and it relies on calls to third-party non-Python binaries. It will probably be necessary to spin off Leptonica into a separate package that gets compiled as a binary wheel, something I’ve already started work on actually. That means installer-generator programs that try to inspect the source code for dependencies are probably going to fail, because they usually look for Python-only dependencies.
