May 17, 2016
Question

Can Acrobat do batch conversion of 100,000+ documents in multiple native formats into PDFs with OCR-English and OCR-Spanish?


I am trying to figure out the right kind of software for a specific conversion job. I am not very tech-literate so please keep any answers in plain language.

We anticipate having 100,000+ documents in multiple native formats (definitely Office formats, but also emails and perhaps some other formats common to an office environment) that we need to convert to PDFs. Most documents will be in English, but a small number will be in Spanish. The output needs to be searchable, multi-page PDFs at 300 dpi, with OCR in English or Spanish depending on the language of the original document.

Is it possible, and fairly straightforward for a person competent with the program, to batch-convert a huge collection of these documents using Acrobat (divided into separate folders by language and/or file type if necessary)? Judging by the basic information available, it seems like it should be possible. However, one knowledgeable person has told me that it would not be straightforward: people with some expertise would need to write scripts to do it, and those scripts might take 12 hours to create.

Thank you.


3 replies

Legend
May 17, 2016

As you have already been told, Acrobat is not suitable for such a job.

One of the first places to look for industrial-strength solutions would be PDF-Tools AG. They have a good range of such products, and they can also customize them for your specific needs (in other words, they know what they are doing).

You might also look at the members of PDFA.org.

Hope this helps.

try67
Community Expert
May 17, 2016

Definitely not. Acrobat is not built for processing this number of files in a single operation. The maximum number you should count on is around 300 at a time. You need a much more robust application for this type of batch operation.
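
To put the 300-at-a-time figure in perspective, here is a minimal Python sketch that splits a file list into groups of at most 300. The `chunk` helper and the made-up file names are illustrative assumptions only; nothing here calls Acrobat or any other converter.

```python
from typing import Iterator, List, Sequence


def chunk(paths: Sequence[str], size: int = 300) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` paths (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


# Made-up file names standing in for the 100,000 source documents.
files = [f"doc_{i:06d}.docx" for i in range(100_000)]
batches = list(chunk(files))

# 100,000 files at 300 per batch gives 334 batches; the last one holds 100 files.
print(len(batches), len(batches[-1]))
```

In other words, a job of this size would mean starting well over 300 separate batch runs by hand, which is why a purpose-built tool is the more realistic route.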

May 17, 2016

Also, the OCR would need to be done as a batch too - not manually for each document. We want each PDF to be a separate document, not combined into one.
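
Whatever tool ends up doing the conversion, the requirement above (one PDF per source document, with the OCR language chosen per document) amounts to a job list like the one in this rough Python sketch. It assumes the sources have already been sorted into `english/` and `spanish/` folders; the folder names, the `eng`/`spa` language codes, the `pdf` output folder, and the `plan_jobs` helper are all hypothetical, and the sketch only plans the work without converting anything.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, NamedTuple

# Assumed top-level folder names mapped to assumed OCR language codes.
OCR_LANG = {"english": "eng", "spanish": "spa"}


class Job(NamedTuple):
    source: str    # path of the original document
    output: str    # path of the single PDF produced for it
    ocr_lang: str  # OCR language to request for this document


def plan_jobs(sources: Iterable[str], out_dir: str = "pdf") -> List[Job]:
    """Map each source file to exactly one output PDF and one OCR language."""
    jobs = []
    for src in sources:
        path = PurePosixPath(src)
        folder = path.parts[0]
        if folder not in OCR_LANG:
            raise ValueError(f"no OCR language known for folder: {folder}")
        # Keep the original extension in the output name so that
        # report.docx and report.xlsx do not overwrite each other.
        out = PurePosixPath(out_dir) / folder / (path.name + ".pdf")
        jobs.append(Job(src, str(out), OCR_LANG[folder]))
    return jobs


jobs = plan_jobs(["english/report.docx", "spanish/carta.msg"])
for job in jobs:
    print(job.source, "->", job.output, f"(OCR: {job.ocr_lang})")
```

Each entry in the list is an independent conversion, so the PDFs stay separate and the OCR runs as part of the batch instead of being triggered by hand per document.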