[3rdparty]: DecompressionBombError with default max_image_mpixels value — effective PIL limit is only 300 pixels #1664

@goggo68

Simple sanity checks

  • This is an issue with an app that uses OCRmyPDF for OCR
  • I am using a recent version of the third party app
  • I will include a file that reproduces the issue

Third party app name and version

ocrmypdf 16.12.0, Pillow (in venv), Python 3.13

Describe the bug

Environment: Paperless-ngx native install, Debian 13, Python venv at /opt/paperless/.venv
When processing scanned PDFs (~10 MP, typical smartphone/scanner output), ocrmypdf raises a DecompressionBombError with an unexpectedly low limit:
DecompressionBombError: Image size (10385522 pixels) exceeds limit of 600 pixels, could be decompression bomb DOS attack.
The limit of 600 is 2 * PIL.Image.MAX_IMAGE_PIXELS. This means MAX_IMAGE_PIXELS is being set to 300, which is clearly unintentional.
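The arithmetic behind that inference can be sketched in a few lines. This is a simplified stand-in for Pillow's size check, not Pillow's own code: the function name classify_image is hypothetical, and it assumes the behavior implied by the error message (a warning above MAX_IMAGE_PIXELS, an error above twice that value).

```python
from typing import Optional


def classify_image(pixels: int, max_image_pixels: Optional[int]) -> str:
    """Simplified model of a decompression bomb check (hypothetical helper)."""
    if max_image_pixels is None:
        return "ok"  # no limit configured
    if pixels > 2 * max_image_pixels:
        return "error"  # corresponds to DecompressionBombError
    if pixels > max_image_pixels:
        return "warning"  # corresponds to DecompressionBombWarning
    return "ok"


reported_limit = 600  # taken from the error message in this report
implied_max = reported_limit // 2  # the error threshold is twice the setting

print("implied MAX_IMAGE_PIXELS:", implied_max)
print("10.4 MP scan with that setting:", classify_image(10_385_522, implied_max))
print("10.4 MP scan with ~89 MP limit:", classify_image(10_385_522, 89_478_485))
```

With a setting of 300, any image larger than 600 pixels lands in the error branch, which matches the failure described here.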
Root cause
In ocrmypdf/_validation.py:
def check_options_pillow(options: Namespace) -> None:
    PIL.Image.MAX_IMAGE_PIXELS = int(options.max_image_mpixels * 1_000_000)
    if PIL.Image.MAX_IMAGE_PIXELS == 0:
        PIL.Image.MAX_IMAGE_PIXELS = None
The default value of options.max_image_mpixels results in int(... * 1_000_000) = 300, which is far below any real-world image size. Because the result is not 0, the == 0 guard never triggers, and every non-trivial image fails with a DecompressionBombError.
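To see which option values lead to which limits, the conversion quoted above can be replayed in isolation. The sketch below is a standalone mirror of that logic, not the OCRmyPDF function itself; to_pillow_limit and the sample values are chosen only for illustration. For the product to come out as 300, the option would have to hold roughly 0.0003, so it may be worth checking what value actually reaches max_image_mpixels in the affected install.

```python
from typing import Optional


def to_pillow_limit(max_image_mpixels: float) -> Optional[int]:
    """Mirror of the quoted conversion: megapixels to pixels, 0 meaning no limit."""
    limit = int(max_image_mpixels * 1_000_000)
    return None if limit == 0 else limit


# Illustrative inputs only; none of these is claimed to be the real default.
for value in (0, 0.0003, 89.5, 250.0):
    print(f"max_image_mpixels={value!r} -> MAX_IMAGE_PIXELS={to_pillow_limit(value)!r}")
```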
Expected behavior
The default should either preserve Pillow's own default (~89 MP) or be set to None (no limit). A limit of 300 pixels effectively breaks OCR for any real document.
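One way the expected behavior could look is sketched below. Everything here is hypothetical: safe_pillow_limit, the MIN_PLAUSIBLE_PIXELS floor, and the choice to raise on implausibly small limits are illustrative assumptions, not OCRmyPDF's design. The fallback constant uses the same arithmetic as Pillow's default of about 89 MP.

```python
from typing import Optional

# Same arithmetic as Pillow's default limit: 89_478_485 pixels, about 89 MP.
PILLOW_DEFAULT_MAX_PIXELS = 1024 * 1024 * 1024 // 4 // 3
# Hypothetical sanity floor: anything below 1 MP is treated as a misconfiguration.
MIN_PLAUSIBLE_PIXELS = 1_000_000


def safe_pillow_limit(max_image_mpixels: Optional[float]) -> Optional[int]:
    """Convert megapixels to a pixel limit, rejecting implausibly small values."""
    if max_image_mpixels is None:
        return PILLOW_DEFAULT_MAX_PIXELS  # option unset: keep Pillow's default
    limit = int(max_image_mpixels * 1_000_000)
    if limit == 0:
        return None  # explicit 0 disables the limit
    if limit < MIN_PLAUSIBLE_PIXELS:
        raise ValueError(
            f"max_image_mpixels={max_image_mpixels!r} gives a limit of only "
            f"{limit} pixels; was a value in the wrong unit passed?"
        )
    return limit


for value in (None, 0, 250.0, 0.0003):
    try:
        print(value, "->", safe_pillow_limit(value))
    except ValueError as exc:
        print(value, "-> rejected:", exc)
```

Failing loudly at option validation would surface a unit mix-up immediately, instead of as a DecompressionBombError on the first real document.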
Workaround
sed -i 's/PIL\.Image\.MAX_IMAGE_PIXELS = int(options\.max_image_mpixels \* 1_000_000)/PIL.Image.MAX_IMAGE_PIXELS = None/' \
    /path/to/.venv/lib/python3.13/site-packages/ocrmypdf/_validation.py

Steps to reproduce

1. Import attached file into Paperless-ngx
2. Trigger OCR
3. Check log file
4. ...

Files

No response

OCRmyPDF version

No response

Relevant log output

