DecompressionBombError: Image size exceeds limit of x pixels, could be decompression bomb DOS attack
Describe the issue
Forcing an OCR results in PIL.Image.DecompressionBombError
To Reproduce
$ ocrmypdf -v -l deu+eng --force-ocr --sidecar out.txt /path/to/input.pdf out.pdf
DEBUG - ocrmypdf 9.0.0
DEBUG - os.symlink(/path/to/input.pdf, /var/folders/pf/7vhqx5bn41qddypw08w9jc4w0000gn/T/com.github.ocrmypdf.civmiyes/origin)
DEBUG - os.symlink(/var/folders/pf/7vhqx5bn41qddypw08w9jc4w0000gn/T/com.github.ocrmypdf.civmiyes/origin, /var/folders/pf/7vhqx5bn41qddypw08w9jc4w0000gn/T/com.github.ocrmypdf.civmiyes/origin.pdf)
Scan: 100%|...| 1/1 [00:00<00:00, 59.04page/s]
INFO - Using Tesseract OpenMP thread limit 4
INFO - 1: page already has text! - rasterizing text and running OCR anyway
DEBUG - 1: Rasterize with png16m
DEBUG - 1: ['gs', '-dQUIET', '-dSAFER', '-dBATCH', '-dNOPAUSE', '-sDEVICE=png16m', '-dFirstPage=1', '-dLastPage=1', '-r2897.060413x2897.060413', '-o', '/var/folders/pf/7vhqx5bn41qddypw08w9jc4w0000gn/T/tmpeam5vcbf', '-dAutoRotatePages=/None', '-f', '/var/folders/pf/7vhqx5bn41qddypw08w9jc4w0000gn/T/com.github.ocrmypdf.civmiyes/origin.pdf']
OCR: 0%| | 0.0/1.0 [00:00<?, ?page/s]
ERROR - An exception occurred while executing the pipeline
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/local/opt/python/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/ocrmypdf/_sync.py", line 100, in exec_page_sync
remove_vectors=False,
File "/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/ocrmypdf/_pipeline.py", line 446, in rasterize
filter_vector=remove_vectors,
File "/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/ocrmypdf/exec/ghostscript.py", line 175, in rasterize_pdf
with Image.open(tmp) as im:
File "/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/PIL/Image.py", line 2690, in open
im = _open_core(fp, filename, prefix)
File "/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/PIL/Image.py", line 2677, in _open_core
_decompression_bomb_check(im.size)
File "/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/PIL/Image.py", line 2611, in _decompression_bomb_check
(pixels, 2 * MAX_IMAGE_PIXELS))
PIL.Image.DecompressionBombError: Image size (811374000 pixels) exceeds limit of 256000000 pixels, could be decompression bomb DOS attack.
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/ocrmypdf/_sync.py", line 338, in run_pipeline
exec_concurrent(context)
File "/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/ocrmypdf/_sync.py", line 269, in exec_concurrent
page_result = results.next()
File "/usr/local/opt/python/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 748, in next
raise value
PIL.Image.DecompressionBombError: Image size (811374000 pixels) exceeds limit of 256000000 pixels, could be decompression bomb DOS attack.
Example file
The file contains personal information. Happy to provide the encrypted version though.
Expected behavior
It worked for many similar PDFs. I would have expected this to just work the same way.
System:
- OS: macOS 10.13.6
- OCRmyPDF Version: 9.0.0
Additional context
This is not a scanned but rather a generated PDF.
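The numbers in the log can be cross-checked with a little arithmetic. The sketch below is not part of the original report. It takes the resolution from the `-r` argument of the Ghostscript call and the pixel counts from the error message, and assumes the same resolution on both axes (as that `-r` argument indicates). The implied page area comes out close to that of an A4 page, which is an inference, since the issue does not state the page size. The helper names `pixels_at` and `max_dpi` are made up for illustration.

```python
import math

DPI = 2897.060413              # from -r2897.060413x2897.060413 in the log
REPORTED_PIXELS = 811_374_000  # "Image size" in the error message
ERROR_LIMIT = 256_000_000      # "exceeds limit of" in the error message

# Page area implied by the log, in square inches (pixels / dpi^2).
area_sq_in = REPORTED_PIXELS / DPI ** 2


def pixels_at(dpi, area=area_sq_in):
    """Pixel count of the same page rasterized at a different resolution."""
    return area * dpi ** 2


def max_dpi(area, pixel_limit):
    """Highest resolution that keeps the page at or under pixel_limit."""
    return math.sqrt(pixel_limit / area)


if __name__ == "__main__":
    print(f"implied page area: {area_sq_in:.2f} sq in")
    print(f"pixels at 300 dpi: {pixels_at(300):,.0f}")
    print(f"max dpi under the limit: {max_dpi(area_sq_in, ERROR_LIMIT):.0f}")
```

So the page itself is ordinary; it is the unusually high rasterization resolution that pushes the image past the limit.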
Issue Analytics
- State:
- Created 4 years ago
- Comments:11
Top GitHub Comments
--image-dpi has no effect except when the input file is an image. There’s supposed to be a warning message that explains this (although maybe it should be an error).
You can use Ghostscript to downsample the file first. Its documentation describes the options for fine-tuning the resolution.
Since Ghostscript can do this, I don’t think ocrmypdf should implement the same feature.
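The exact Ghostscript command from this comment is not preserved here. As a rough sketch of what a downsampling pass could look like, the snippet below builds a `pdfwrite` command line using Ghostscript's image downsampling parameters. The 300 dpi target, the file names, and the helper names `gs_downsample_args` and `downsample` are assumptions for illustration, not the commenter's settings.

```python
import subprocess


def gs_downsample_args(src, dst, dpi=300):
    """Build a Ghostscript argv that rewrites src with embedded images downsampled to dpi."""
    args = ["gs", "-dQUIET", "-dSAFER", "-sDEVICE=pdfwrite"]
    for kind in ("Color", "Gray", "Mono"):
        args.append(f"-dDownsample{kind}Images=true")
        args.append(f"-d{kind}ImageResolution={dpi}")
    # -o sets the output file and implies -dBATCH -dNOPAUSE.
    args += ["-o", dst, "-f", src]
    return args


def downsample(src, dst, dpi=300):
    """Run Ghostscript; requires the gs executable on PATH."""
    subprocess.run(gs_downsample_args(src, dst, dpi), check=True)


if __name__ == "__main__":
    # Only print the command here; call downsample() to actually run it.
    print(" ".join(gs_downsample_args("input.pdf", "downsampled.pdf")))
```

The downsampled file would then be passed to ocrmypdf in place of the original. Whether this lowers the rasterization resolution enough depends on the PDF.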
Solved: open the file “/usr/local/Cellar/ocrmypdf/9.0.0/libexec/lib/python3.7/site-packages/PIL/Image.py” and change MAX_IMAGE_PIXELS to whatever limit you want.
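For anyone wondering how that setting relates to the number in the error message: the traceback shows the error reporting `2 * MAX_IMAGE_PIXELS`, so the 256000000 limit corresponds to a setting of 128000000. The function below is a simplified pure-Python model of that threshold behaviour, written for illustration and not copied from Pillow's source. The name `bomb_check` and the string return values are made up.

```python
def bomb_check(pixels, max_image_pixels):
    """Illustrative model of Pillow's size check (not the library's code)."""
    if max_image_pixels is None:
        return "ok"       # a limit of None disables the check
    if pixels > 2 * max_image_pixels:
        return "error"    # Pillow raises DecompressionBombError here
    if pixels > max_image_pixels:
        return "warning"  # Pillow emits DecompressionBombWarning here
    return "ok"


if __name__ == "__main__":
    # Values from this issue: 811374000 pixels against a limit of 2 * 128000000.
    print(bomb_check(811_374_000, 128_000_000))
    # A hypothetical higher setting that would let the same image through.
    print(bomb_check(811_374_000, 500_000_000))
```

In a script that imports Pillow itself, assigning `PIL.Image.MAX_IMAGE_PIXELS` before calling `Image.open` has the same effect as editing the installed `Image.py`, and it is not lost when the package is upgraded. Raising the limit also removes the protection the check is there to provide, so it is best done only for files you trust.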