--unpaper-args and --clean-final


In the release notes it is stated that

The argument --clean-final now implies --clean.

In the documentation however the following commands are shown as an example:

ocrmypdf --clean --clean-final --unpaper-args '--layout double' input.pdf output.pdf
ocrmypdf --clean --clean-final --unpaper-args '--layout double --no-noisefilter' input.pdf output.pdf

So are those examples outdated, or am I misreading the release notes? If --clean-final implies --clean, then --clean could be omitted in those examples.

Additionally, in another thread you wrote

I am thinking that I will change the behavior so that --unpaper-args implies --clean and --clean-final.

Do you still plan to implement that?

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 9

Top GitHub Comments

1 reaction
jbarlow83 commented, Jun 17, 2019

@githubnavigator The idea would be that all of that functionality could be packaged in a short script: ocrmypdf-poster in.pdf out.pdf.

Generally I want to make it easier for people to customize things and attract more contributions.

I am working both on making ocrmypdf usable as a library and on making the library extensible. That should make it easier to implement page splitting, but I don’t think it would be doable with plugins alone. As it currently looks, the way to do that would be to use components of ocrmypdf as a library in a custom command line driver.

1 reaction
githubnavigator commented, Jun 14, 2019

Oh. Well, if possible, if you could please leave the “--unpaper-args” option in there, I’d really appreciate it. A separate Python script would significantly increase the complexity for me.

I was able to achieve my desired goals using these commands:

mutool poster -x 2 -y 1 in.pdf out.pdf
ocrmypdf --force-ocr --unpaper-args '--layout single' --deskew --clean-final in.pdf out.pdf
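
For anyone who wants both steps behind a single command, here is a minimal Python sketch that runs the two commands above back to back. It is not part of ocrmypdf and is not the planned ocrmypdf-poster script. The names build_commands and run, and the temporary split.pdf file, are made up for illustration. The sketch assumes the second command is meant to read the output of the first, and that mutool and ocrmypdf are both on PATH.

```python
import subprocess
import tempfile
from pathlib import Path


def build_commands(src, dst, split):
    """Return the two argv lists; `split` is a hypothetical intermediate PDF."""
    # Step 1: cut every page into a 2x1 grid (left and right halves).
    poster = ["mutool", "poster", "-x", "2", "-y", "1", str(src), str(split)]
    # Step 2: OCR the split pages with the same flags as the command above.
    # "--layout single" is passed as one argument, as the shell quotes do.
    ocr = [
        "ocrmypdf", "--force-ocr",
        "--unpaper-args", "--layout single",
        "--deskew", "--clean-final",
        str(split), str(dst),
    ]
    return poster, ocr


def run(src, dst):
    """Run both steps; the intermediate file is deleted afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        split = Path(tmp) / "split.pdf"
        for cmd in build_commands(src, dst, split):
            # check=True raises CalledProcessError if a step fails.
            subprocess.run(cmd, check=True)
```

Calling run("in.pdf", "out.pdf") would then do the same work as the two commands, without leaving the intermediate file behind.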

