A set of failing PDFs

I recently used ocrmypdf to mass-OCR my PDFs and a bunch of DjVu files I converted to PDF (which strips the original Tesseract OCR so I needed some way to restore it). Worked very nicely, and I like the better compression over the default ddjvu output.

Some files failed. I noticed the mention of a test corpus, so I thought you might like a list of failing files (these failed multiple times, so should be reliable test cases) and the errors.
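For anyone who wants to build a similar list of failures, here is a minimal sketch of a batch run that retries each file and records the ones that keep failing. It assumes the `ocrmypdf` command is on the PATH and is called in its basic `ocrmypdf input.pdf output.pdf` form, and it treats a non-zero exit code as a failure. The names `ocr_one`, `collect_failures`, and the `attempts` and `runner` parameters are made up for this illustration and are not part of ocrmypdf.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def ocr_one(src: Path, dst: Path) -> Tuple[int, str]:
    """Run the ocrmypdf CLI on one file and return (exit code, stderr text)."""
    proc = subprocess.run(
        ["ocrmypdf", str(src), str(dst)],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


def collect_failures(
    files: Iterable[str],
    out_dir: str,
    runner: Callable[[Path, Path], Tuple[int, str]] = ocr_one,
    attempts: int = 2,
) -> Dict[str, str]:
    """Return {input path: last error text} for files that fail every attempt."""
    failures: Dict[str, str] = {}
    for src in files:
        dst = Path(out_dir) / Path(src).name
        last_err = ""
        for _ in range(attempts):
            code, last_err = runner(Path(src), dst)
            if code == 0:
                break  # success, no need to retry
        else:
            # Loop finished without a break: every attempt failed.
            failures[str(src)] = last_err.strip()
    return failures
```

Calling `collect_failures(sorted(str(p) for p in Path("in").glob("*.pdf")), "out")` would then give a mapping from each repeatedly failing file to its last error output, which can be written to a text file. The `in` and `out` directory names are placeholders, and the output directory has to exist beforehand.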

The errors:

myocr-gwernnet-errors.txt

The files:

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 10

Top GitHub Comments

jbarlow83 commented, Dec 15, 2018 (1 reaction)

The problem is quite definitely how these files are formatted. In any case, the next release should be more tolerant of PDFs with these types of errors: it will issue warnings instead of failing.

Going by the logs, I concluded that the errors are mostly the same.

jbarlow83 commented, Jan 9, 2019 (0 reactions)

Probably fixed this, or at least suppressed the immediate cause of the stack trace, in the next release.
