Converting old scanned PDFs to clean documents

Jun 25, 2020

I have a collection of books in old scanned PDF files. It looks like someone laid each page flat on a scanner so you can see page edges and shadows in the creases.

Is there a way to automatically upload, scan, and convert these old, poorly scanned files into clean Google Drive or Word documents?

Jun 25, 2020

Hi Pledgerar,

In a word, well, two words: yes and no.

It is easy to load the PDFs to convert to searchable text.

It is not easy to work with the folds. What you will get is very hit and miss.

I'm assuming that you've got the latest Acrobat Pro DC; my screenshots refer to that version. If you have an older version, the screenshots may or may not point you in the right direction, depending on how old it is.

Go to the Scan & OCR tool and, from the middle of the tool's selection area, choose Recognize Text -> In Multiple Files.

Then, from the dropdown menu in the upper left, you can add individual files, one or more folders, or documents that are already open.

Then let it run.
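
For anyone who would rather script the same "point it at a folder and let it run" idea, here is a rough sketch. To be clear, this is not Acrobat and not what the steps above do: it assumes the open-source OCRmyPDF command-line tool is installed separately, and the folder names and function names are made up for illustration. The `--deskew` option only straightens crooked pages; it will not remove crease shadows.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src: Path, dst: Path) -> list[str]:
    """Build an OCRmyPDF command line for one scanned PDF (assumed tool)."""
    return [
        "ocrmypdf",
        "--deskew",  # straighten pages that were scanned at a slight angle
        str(src),
        str(dst),
    ]


def plan_batch(in_dir: Path, out_dir: Path) -> list[list[str]]:
    """Return one command per PDF found directly inside in_dir."""
    pdfs = sorted(p for p in in_dir.iterdir() if p.suffix.lower() == ".pdf")
    return [build_ocr_command(p, out_dir / p.name) for p in pdfs]


def run_batch(in_dir: Path, out_dir: Path) -> None:
    """Run OCR on every PDF in in_dir, writing results to out_dir."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    for cmd in plan_batch(in_dir, out_dir):
        # check=False so one bad scan does not stop the whole batch
        result = subprocess.run(cmd, check=False)
        if result.returncode != 0:
            print(f"OCR failed for {cmd[-2]} (exit code {result.returncode})")


if __name__ == "__main__":
    # Hypothetical folder names; change them to match your own files.
    run_batch(Path("scanned_books"), Path("ocr_output"))
```

The result is a set of searchable PDFs in the output folder, with the originals left untouched. Getting from there to Word or Google Drive documents is a separate step.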

Unfortunately, this is not all that different from other things in life: garbage in may give you garbage out. If the quality of what goes in is poor, you are not likely to get great results coming out. I've always found the fact that OCR works at all to be a miracle, but miracles only go so far. Recognizing text on a curved page, with shadow that ranges from light to dark, is a big barrier to overcome.

That notwithstanding, if you are willing to recognize that one barrier, go for it. Just be aware that there will be problems.

Let me know if you need any more guidance on this.

Adobe Community
