Compression with /FlateDecode using /Filter and Predictor
Hi,
A few months ago I found that Photoshop can't save Photoshop PDF images larger than 30,000 pixels in either dimension. I needed it to convert CMYK images with spot colors (another very neglected area) to PDF.
Therefore, I wrote my own tool in Python that does that: it takes TIFF images in CMYK or monochrome (gray levels) and assembles them into a PDF of any size, using fewer than 20 objects. I used a wonderful module (mPDF.py) from Didier Stevens and the zlib library to compress the bitmaps. It works nicely, although Acrobat is slow displaying a 40,000x20,000 pixel image with color planes.
Now the question: has anybody used zlib in Python, in an advanced manner?
I have used zlib and compressed the bitmaps, but I know that if I use /Filter /FlateDecode and add /DecodeParms with Predictor 2, the bitmaps from TIFF images will compress much better. However, I have not found information on how to use it other than in the original zlib C code, which indicates that a setting for the filter should be specified (done) and that a dictionary with the most frequent byte strings to be compressed can be supplied. What is unclear to me is exactly this point: how to set up that 'dictionary', a completely different concept in Python than in C, to drive the optimized/predicted compression.
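One thing worth separating out: the PDF Predictor is not the same thing as zlib's preset dictionary. Predictor 2 is the TIFF horizontal-differencing predictor, applied to each scanline *before* the data is handed to zlib; the /DecodeParms entry only tells the reader to undo it after inflating. Here is a minimal sketch of that idea, assuming 8-bit CMYK samples; the helper name `apply_tiff_predictor2` and the toy data are my own, not from any library:

```python
import zlib

def apply_tiff_predictor2(raw, width, colors=4, bpc=8):
    """TIFF Predictor 2 (horizontal differencing), one row at a time.

    Each sample is replaced by its difference from the same color
    component of the previous pixel; the first pixel of each row is
    left untouched. Handles 8-bit components only in this sketch.
    """
    assert bpc == 8, "sketch handles 8-bit components only"
    row_len = width * colors
    out = bytearray(raw)
    for row_start in range(0, len(raw), row_len):
        # Walk the row backwards so we always read original values.
        for i in range(row_start + row_len - 1, row_start + colors - 1, -1):
            out[i] = (raw[i] - raw[i - colors]) & 0xFF
    return bytes(out)

# Toy 2x2 CMYK image with a smooth horizontal gradient: the predicted
# rows become runs of small identical deltas, which deflate loves.
width, colors = 2, 4
raw = bytes([10, 20, 30, 40,  12, 22, 32, 42,
             50, 60, 70, 80,  52, 62, 72, 82])
predicted = apply_tiff_predictor2(raw, width, colors)
pred_stream = zlib.compress(predicted, 9)
```

The compressed `pred_stream` then goes into the PDF image XObject with `/Filter /FlateDecode` and `/DecodeParms << /Predictor 2 /Colors 4 /BitsPerComponent 8 /Columns 2 >>` (with /Columns set to the real pixel width). No zlib preset dictionary is involved; if you do want one in Python, `zlib.compressobj()` accepts a `zdict=` argument (Python 3.3+), but that is a separate mechanism from the Predictor.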
Also, has anybody played with the level parameter (0 to 9, trading compression speed against output size)?
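For what it's worth, the level trade-off is easy to measure directly; this is just a rough benchmarking sketch on synthetic repetitive data, so the timings and sizes on real scanlines will differ:

```python
import time
import zlib

# ~1 MiB of highly repetitive synthetic data (stand-in for bitmap rows).
data = bytes(range(256)) * 4096

for level in (0, 1, 6, 9):
    t0 = time.perf_counter()
    comp = zlib.compress(data, level)
    dt = (time.perf_counter() - t0) * 1000
    print(f"level {level}: {len(comp):>9} bytes in {dt:6.1f} ms")
```

Level 0 stores the data uncompressed (slightly larger than the input, due to framing), level 1 is fastest, 6 is zlib's default, and 9 squeezes hardest at the cost of time. For large page-size bitmaps the difference between 6 and 9 is often small in size but noticeable in speed.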
Thanks in advance,
Ignacio