Option `--output-type pdf` causes PDF to be blank (with wrong geometry)
With the following PDF file: test.pdf
If you run `ocrmypdf --output-type pdf test.pdf foo.pdf`, the file foo.pdf is 10 times smaller than the original and is blank when you open it. Remove the option `--output-type pdf` and the problem is gone.
I noted this:
$ pdfimages foo.pdf img
$ identify -format "%@" img-000.pbm
identify-im6.q16: geometry does not contain image `img-000.pbm' @ warning/attribute.c/GetImageBoundingBox/247.
0x0+2023+2866
The geometry 0x0 is obviously problematic.
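The check above could be scripted roughly as follows. This is only a sketch: it assumes the `identify -format "%@"` output always has the `WxH+X+Y` form seen in the line above, and the helper name `is_blank_geometry` is hypothetical, not part of ocrmypdf or ImageMagick.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when a WxH+X+Y bounding box has zero width or height.
is_blank_geometry() {
    geom=$1
    width=${geom%%x*}      # text before the first "x"
    rest=${geom#*x}        # text after the first "x"
    height=${rest%%+*}     # text before the first "+"
    [ "$width" -eq 0 ] || [ "$height" -eq 0 ]
}

# Example using the value reported in this issue
if is_blank_geometry "0x0+2023+2866"; then
    echo "blank"
else
    echo "has content"
fi
```

With real files, the argument would presumably come from something like `$(identify -format "%@" img-000.pbm 2>/dev/null)`, run on each image that `pdfimages` extracted.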
$ ocrmypdf --version
9.4.0+dfsg
$ tesseract --version
tesseract 4.1.1
leptonica-1.79.0
libgif 5.1.9 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
Found AVX2
Found AVX
Found FMA
Found SSE
Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.1 libzstd/1.4.3
Issue Analytics
- State:
- Created 4 years ago
- Comments:7
Top GitHub Comments
Hi I can confirm this issue has been resolved.
Tested with the test pdf provided on ocrmypdf:v13.6.2 and it gives no issues. All perfect crisp OCR with text layer 😉
Believe can be closed 😃!
@jbarlow83
@fnoy You have an old version of ocrmypdf.
@damnms What version of ocrmypdf do you have?
It’s quite possible this issue was fixed by #732 or commit 49734d which is in v10.2.1+
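To see whether an installed version is at or after v10.2.1, a comparison along these lines may help. It is a sketch, not an official check: `version_ge` is a hypothetical helper, it relies on GNU `sort -V`, and it assumes the version string looks like the `9.4.0+dfsg` shown earlier in this issue. Whether v10.2.1 actually contains the fix is only the guess made in the comment above.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when version $1 is >= version $2 (needs GNU sort -V).
version_ge() {
    have=${1%%+*}   # drop a packaging suffix such as "+dfsg"
    want=$2
    # If $want sorts first (or the two are equal), then $have >= $want.
    [ "$(printf '%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# The two versions mentioned in this thread
for v in "9.4.0+dfsg" "13.6.2"; do
    if version_ge "$v" "10.2.1"; then
        echo "$v: at or after v10.2.1"
    else
        echo "$v: older than v10.2.1"
    fi
done
```

On a real system the first argument could be taken from `$(ocrmypdf --version)`.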