Participant
July 22, 2023
Question

File size reduced after editing a scanned file. How did this happen?


I edited a scanned PDF with the licensed version of Acrobat Reader, i.e., I added one space between two words and saved it. I was surprised that the file size dropped from 5.6 MB to 0.6 MB. I do not understand what is going on internally. Can anyone please explain why this happened? Thanks


4 replies

Participating Frequently
October 14, 2023

Thank you for your participation.

Participating Frequently
July 23, 2023

Always check the line spacing and fonts, people commonly use 1.5 rather than 2.0.

More Gigs in your hard-drive means more capacity for storage.

Abambo
Community Expert
September 18, 2023
quote

Always check the line spacing and fonts, people commonly use 1.5 rather than 2.0.

More Gigs in your hard-drive means more capacity for storage.


By @AdrianS12326318469

Nonsense answer!

ABAMBO | Hard- and Software Engineer | Photographer
Abambo
Community Expert
July 22, 2023
quote

I edited a scanned PDF with the licensed version of Acrobat Reader, i.e., I added one space between two words and saved it


By @seabird karwar Site5C80

With Acrobat Reader? Reader, as the name says, only reads PDF files and cannot edit them (as a general rule).

You transformed an image of a certain quality into text. For text, you only need to store one byte per character (a few more depending on the encoding, but who cares). For a full-colour image, you need 3 bytes per pixel. A character stored as a pixel image needs many pixels to be readable. Optical character recognition (OCR) tries to read the pixel image and detect the characters and the text. The resulting data is indeed text. Only with text will you be able to add a space somewhere.
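
To put rough numbers on that comparison, here is a minimal Python sketch. The 300 dpi resolution, the 12 pt character size, the width ratio and the helper names are illustrative assumptions, not values taken from the scan in question, and the pixel figure is for uncompressed data (real scans are compressed, so the gap is smaller in practice).

```python
# Rough comparison: one character stored as text vs. as uncompressed pixels.
# All numbers below are illustrative assumptions, not measurements.

DPI = 300                 # assumed scan resolution
POINTS_PER_INCH = 72      # typographic points per inch
BYTES_PER_PIXEL = 3       # full colour (RGB), 8 bits per channel


def glyph_cell_pixels(font_size_pt: float, width_ratio: float = 0.5) -> int:
    """Pixels covered by one character cell at the assumed resolution."""
    height_px = round(font_size_pt / POINTS_PER_INCH * DPI)
    width_px = round(height_px * width_ratio)
    return height_px * width_px


def raw_image_bytes(font_size_pt: float) -> int:
    """Uncompressed RGB bytes needed for one character cell."""
    return glyph_cell_pixels(font_size_pt) * BYTES_PER_PIXEL


text_bytes = len("a".encode("utf-8"))   # one ASCII character as text
image_bytes = raw_image_bytes(12)       # the same character as a 12 pt pixel cell
print(f"as text: {text_bytes} byte, as raw RGB pixels: {image_bytes} bytes")
```

Under these assumptions a single 12 pt character occupies a 25 x 50 pixel cell, which is why a page of recognized text can be so much smaller than the picture of that page.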

 

However, OCR is not perfect and can introduce errors and misrecognitions, depending on the scan quality and the complexity of the document. Check for errors in the document.

 

Another way to save space is to optimize an image for size: reducing the image quality also reduces the storage needed. The trick is to reduce the quality only so far that the eye does not notice. Saving a TIFF-encoded image as a JPEG-encoded image may be enough to save considerable space. It's a trade-off of size vs. quality. This won't allow text edits, however, but it is quite possible that the final result is a mix of both.
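
The size vs. quality trade-off can be sketched with the Python standard library alone. Note that this is not JPEG: it uses crude grey-level quantization followed by zlib as a stand-in for lossy compression, and the synthetic "scan" (a noisy gradient), the image size and the number of levels are all assumptions chosen for illustration.

```python
import random
import zlib

random.seed(0)            # fixed seed so the run is repeatable
WIDTH, HEIGHT = 200, 200  # assumed size of the synthetic image

# Synthetic 8-bit greyscale "scan": a horizontal gradient with some noise.
pixels = bytes(
    min(255, max(0, x * 255 // (WIDTH - 1) + random.randint(-8, 8)))
    for _y in range(HEIGHT)
    for x in range(WIDTH)
)


def quantize(data: bytes, levels: int) -> bytes:
    """Throw away detail by snapping each pixel to one of `levels` grey values."""
    step = 256 // levels
    return bytes((value // step) * step for value in data)


lossless_size = len(zlib.compress(pixels, 9))
lossy_size = len(zlib.compress(quantize(pixels, 16), 9))

print(f"raw: {len(pixels)} bytes")
print(f"compressed, full quality: {lossless_size} bytes")
print(f"compressed, 16 grey levels: {lossy_size} bytes")
```

The quantized version compresses to fewer bytes because there is less detail left to encode; a real JPEG encoder makes a far more careful choice about which detail to discard.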

ABAMBO | Hard- and Software Engineer | Photographer
gary_sc
Community Expert
July 22, 2023

Hi, @seabird karwar Site5C80, this is very normal and expected.

 

Here's the issue: when you scan a page, the entire page is one image. When I scan a full letter-sized page and save it as a TIFF image (strongly recommended), it will be about 8 MB. After I OCR that page, it's typically anywhere from 80 to 160 KB. That's because I turned a one-page-sized image into just vector images of the text. And vector images are (almost) always smaller than a pixel-based image. 
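
As a sanity check on the "about 8 MB" figure, here is a small Python sketch that estimates the uncompressed size of a US Letter page. The scan settings listed (300 or 600 dpi, greyscale or RGB) are assumptions for illustration; the settings actually used are not stated in this thread.

```python
# Estimate the uncompressed size of a scanned US Letter page (8.5 x 11 in).
# The scan settings below are assumptions for illustration.

PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def raw_page_bytes(dpi: int, bytes_per_pixel: int) -> int:
    """Width x height in pixels, times the bytes stored per pixel."""
    width_px = round(PAGE_WIDTH_IN * dpi)
    height_px = round(PAGE_HEIGHT_IN * dpi)
    return width_px * height_px * bytes_per_pixel


for label, dpi, bytes_per_pixel in [
    ("300 dpi greyscale", 300, 1),
    ("300 dpi RGB", 300, 3),
    ("600 dpi greyscale", 600, 1),
]:
    size_mb = raw_page_bytes(dpi, bytes_per_pixel) / 1_000_000
    print(f"{label}: {size_mb:.1f} MB")
```

An uncompressed 300 dpi greyscale page comes out at roughly 8.4 MB, which is in line with the figure quoted here; colour or higher resolution pushes it well beyond that.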

 

BTW, here's a piece of trivia: if you have a 600-pixel wide image and resize it down to a 300-pixel wide image, the uncompressed storage size of that image will be 25% of the original, because the height is halved along with the width. What I'm getting at here is that a full-page pixel image can have a very large storage size.
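
The 25% figure follows from halving both dimensions. A minimal Python sketch, assuming an uncompressed RGB image, a resize that keeps the aspect ratio, and a hypothetical 600 x 400 original:

```python
# Why halving the width quarters the (uncompressed) storage size.
# The 600 x 400 original and 3 bytes per pixel are illustrative assumptions.


def raw_bytes(width_px: int, height_px: int, bytes_per_pixel: int = 3) -> int:
    return width_px * height_px * bytes_per_pixel


def resize_keep_aspect(width_px: int, height_px: int, new_width_px: int):
    """Return the new (width, height) when scaling to new_width_px."""
    new_height_px = round(height_px * new_width_px / width_px)
    return new_width_px, new_height_px


original = (600, 400)
resized = resize_keep_aspect(*original, 300)
ratio = raw_bytes(*resized) / raw_bytes(*original)
print(f"{original} -> {resized}: {ratio:.0%} of the original size")
```

Compressed formats will not follow this ratio exactly, but the pixel count always drops to a quarter.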

 

Keep in mind, though, that there will be exceptions to the size issue, and this mostly depends on how the original image was scanned: a poorly scanned image will have a variety of problems, one of which is its storage size.

 

I hope that helps

Participating Frequently
July 23, 2023
  1. quote

    Hi, @seabird karwar Site5C80, this is very normal and expected.

     

    Here's the issue: when you scan a page, the entire page is one image. When I scan a full letter-sized page and save it as a TIFF image (strongly recommended), it will be about 8 MB. After I OCR that page, it's typically anywhere from 80 to 160 KB. That's because I turned a one-page-sized image into just vector images of the text. And vector images are (almost) always smaller than a pixel-based image. 

     

    BTW, here's a piece of trivia: if you have a 600-pixel wide image and resize it down to a 300-pixel wide image, the uncompressed storage size of that image will be 25% of the original, because the height is halved along with the width. What I'm getting at here is that a full-page pixel image can have a very large storage size.

     

    Keep in mind, though, that there will be exceptions to the size issue, and this mostly depends on how the original image was scanned: a poorly scanned image will have a variety of problems, one of which is its storage size.

     

    I hope that helps


    By @gary_sc
  2. Hope this helps: https://helpx.adobe.com/document-cloud/help/document-cloud-on-acp.html
Abambo
Community Expert
September 18, 2023
ABAMBO | Hard- and Software Engineer | Photographer