
--unpaper-args and --clean-final


The release notes state:

The argument --clean-final now implies --clean.

The documentation, however, shows the following commands as examples:

ocrmypdf --clean --clean-final --unpaper-args '--layout double' input.pdf output.pdf
ocrmypdf --clean --clean-final --unpaper-args '--layout double --no-noisefilter' input.pdf output.pdf

So are those examples outdated, or am I misreading the statement in the release notes? If --clean-final implies --clean, then --clean could be omitted from those examples.

Additionally, in another thread you wrote

I am thinking that I will change the behavior so that --unpaper-args implies --clean and --clean-final.

Do you still plan to implement that?

Issue Analytics

Created 4 years ago
Comments: 9

Top GitHub Comments

1 reaction

jbarlow83 commented, Jun 17, 2019

@githubnavigator The idea would be that all of that functionality could be packaged in a short script: ocrmypdf-poster in.pdf out.pdf.

Generally I want to make it easier for people to customize things and attract more contributions.

I am working both on making ocrmypdf usable as a library and on making the library extensible. That should make it easier to implement page splitting, but I don't think it would be doable with plugins alone. As it currently looks, the way to do that would be to use components of ocrmypdf as a library in a custom command line driver.

1 reaction

githubnavigator commented, Jun 14, 2019

Oh. Well, if possible, if you could please leave the "--unpaper-args" option in there, I'd really appreciate it. A separate Python script would significantly increase the complexity for me.

I was able to achieve my desired goals using these commands:

mutool poster -x 2 -y 1 in.pdf out.pdf
ocrmypdf --force-ocr --unpaper-args '--layout single' --deskew --clean-final in.pdf out.pdf
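
For readers who want the two commands above as a single step, in the spirit of the "ocrmypdf-poster in.pdf out.pdf" idea mentioned in this thread, here is a minimal sketch. It is not part of ocrmypdf: the function names `build_commands` and `run` and the temporary file name `split.pdf` are made up for illustration, and it assumes that `mutool` and `ocrmypdf` are both installed and on the PATH. It simply shells out to the same two commands, writing the split pages to a temporary file in between.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_commands(src, split, dst):
    """Return the two argument lists (hypothetical helper, not an ocrmypdf API).

    src   -- input PDF with two pages per sheet
    split -- intermediate PDF written by mutool
    dst   -- final OCRed output PDF
    """
    # Same as: mutool poster -x 2 -y 1 in.pdf out.pdf
    poster = ["mutool", "poster", "-x", "2", "-y", "1", str(src), str(split)]
    # Same as: ocrmypdf --force-ocr --unpaper-args '--layout single'
    #          --deskew --clean-final in.pdf out.pdf
    # The unpaper arguments stay one list element, like the quoted shell string.
    ocr = [
        "ocrmypdf", "--force-ocr",
        "--unpaper-args", "--layout single",
        "--deskew", "--clean-final",
        str(split), str(dst),
    ]
    return poster, ocr


def run(src, dst):
    """Split the pages with mutool, then OCR the result with ocrmypdf."""
    with tempfile.TemporaryDirectory() as tmp:
        split = Path(tmp) / "split.pdf"
        for cmd in build_commands(src, split, dst):
            # check=True stops at the first command that fails
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        run(sys.argv[1], sys.argv[2])
    else:
        print("usage: poster_ocr.py in.pdf out.pdf")
```

Saved under a name such as poster_ocr.py (also just an example), it would be called as `python poster_ocr.py in.pdf out.pdf`.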
