GioTavanlar
Participant
August 31, 2022
Question

File Size Reduction vs. Searchability - which to do first?


The task: Make an image-heavy PDF searchable and not "too large" to upload.

My questions:

Does reducing a PDF's file size degrade its searchability so much that it is best to wait until the PDF is optimized before running OCR?

Or does making it searchable only increase the file size, making size reduction the more logical final step in the process?

Help sought. Bad.

This topic has been closed for replies.

1 reply

gary_sc
Community Expert
August 31, 2022

The primary thing that affects file size is images. If the original images are JPGs, you can decrease their storage size by compressing them. But the greater the compression, the lower the image quality.

Then, for any image, the higher the resolution, the greater the storage size. If you double the resolution, the pixel count (and with it, roughly, the storage size) goes up 4x. Where people often go wrong is using an image larger than they need. For example, if you need a resolution of 300 ppi, a 2" wide image only needs to be 600 pixels wide. If you take an image from a phone and place it directly into the document, it is likely to have a width of 4000-5000 pixels. That's a lot more than you need, and it significantly increases the size of the document.
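To put rough numbers on that, here is a small Python sketch of the arithmetic. The function names are made up for illustration, and it counts raw pixels only, assuming the aspect ratio stays the same; the size of a compressed JPG on disk will not scale exactly like this.

```python
def pixels_needed(print_width_in: float, ppi: int) -> int:
    """Pixel width needed to show an image at a given physical width and resolution."""
    return round(print_width_in * ppi)


def pixel_count_ratio(actual_width_px: int, needed_width_px: int) -> float:
    """How many times more pixels the image carries than needed.

    Assumes the same aspect ratio, so the ratio of pixel counts is the
    square of the ratio of widths.
    """
    return (actual_width_px / needed_width_px) ** 2


needed = pixels_needed(2, 300)           # 2" wide at 300 ppi
print(needed)                            # 600
print(pixel_count_ratio(1200, needed))   # 4.0 (doubling the resolution)
print(round(pixel_count_ratio(4000, needed), 1))  # 44.4 (a 4000 px phone photo)
```

So a 4000-pixel-wide phone photo placed at 2" carries roughly 44 times the pixels that a 300 ppi version would, which is why downsampling before (or while) optimizing makes such a difference.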

But as to your question of whether searchability affects storage size: well, that's pretty much irrelevant.

I hope that helps to some degree. 

Legend
August 31, 2022

Personally, I'd say, firstly, that reducing file size may very much affect FUTURE work on searchability, because OCR depends on good-quality originals, and reducing size generally means reducing quality. Secondly, OCR can replace images with text, so it may hugely reduce file size, finishing the job. Check the OCR options carefully.