Which SDK do we need to buy to convert an image PDF to a text PDF?

Question

Hi all,

Our requirements are very simple. We have numerous PDFs that are scanned copies of letters containing text and images. We want to convert them into searchable PDFs without losing the document layout or real image content such as logos and pictures. We want to do this programmatically using an Acrobat SDK or API, and NOT using the Adobe Acrobat Pro DC application. Please let me know which Acrobat SDK we need to get. Old forum posts suggest SDKs like Capture, PDF Library, Acrobat SDK, or LiveCycle ES, and I am not sure which of these we need. These SDKs come with numerous additional features that we don't need. Is there a basic SDK that we can buy for our simple requirement?

Regards,
Viral Sheth

Joel Geraci · Accepted Answer

Thanks once again, Joel. But ABBYY is a third-party library. Doesn't Adobe provide its own SDK or libraries to perform OCR on PDFs? I am really surprised that nobody from Adobe's sales team has even tried to approach a potential customer like me. Now I am somewhat skeptical about how responsive their tech support team would be. I am trying some other products as well. Those vendors got in touch with me as soon as they saw a potential customer in me.

Lov435:

While I understand your frustration, I can tell you from experience (I helped launch Acrobat Capture and its API) that when Adobe tried to supply PDF OCR capabilities, nearly everyone we talked to wanted to replace the built-in OCR engine with their own... and the Adobe engine was really good... but OCR is one of those things that people have strong opinions about. Because Adobe licenses the PDF Library to multiple third-party OCR developers, you get the best of both worlds: competition and choice in the actual OCR space, and Adobe technology when the resulting PDF file is created.

Joel Geraci · Answer

The Adobe PDF Library would be the tool you'd use to get at the images and then insert any recognized text back into the PDF, but you're pretty much on your own to find a library that will deconstruct the page and perform the OCR.
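To illustrate the "insert any recognized text back into the PDF" step, here is a minimal Python sketch that is not tied to the Adobe PDF Library or to any particular OCR engine. It assumes the OCR step has already produced word bounding boxes in image pixels, that the scanned image covers the whole unrotated page, and that a font resource named `F1` exists on the page. The names `OcrWord`, `escape_pdf_string`, and `invisible_text_ops` are hypothetical. It only builds the page content operators that place each word as invisible text (text rendering mode 3) on top of the scan, so the original image stays untouched while the text becomes searchable.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    """One recognized word; the box is in image pixels with a top-left origin."""
    text: str
    left: int
    top: int
    width: int
    height: int


def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(words, image_height_px: int, dpi: float,
                       font_name: str = "F1") -> str:
    """Build content-stream operators that overlay words as invisible text.

    Pixels are scaled to PDF points (72 per inch) and the y axis is flipped,
    because default PDF user space has its origin at the bottom-left.
    """
    scale = 72.0 / dpi
    lines = ["BT", "3 Tr"]  # rendering mode 3: neither fill nor stroke
    for w in words:
        x = w.left * scale
        # Assumption: the bottom of the box approximates the text baseline.
        y = (image_height_px - (w.top + w.height)) * scale
        size = w.height * scale  # assumption: box height approximates font size
        lines.append(f"/{font_name} {size:.2f} Tf")
        lines.append(f"1 0 0 1 {x:.2f} {y:.2f} Tm")  # absolute position
        lines.append(f"({escape_pdf_string(w.text)}) Tj")
    lines.append("ET")
    return "\n".join(lines)


if __name__ == "__main__":
    # A letter-size page scanned at 300 dpi is 2550 x 3300 pixels.
    sample = [OcrWord("Hello (1)", left=300, top=150, width=600, height=60)]
    print(invisible_text_ops(sample, image_height_px=3300, dpi=300))
```

Whatever PDF toolkit you choose would still be responsible for appending these operators to the page content and registering the font. Text outside basic ASCII would also need an appropriate font encoding, which this sketch does not handle.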

http://www.datalogics.com/products/pdf/pdflibrary/

J-
