Perform OCR and save it as PDF/A using C# Code

Report · Jun 25, 2024

Hello

I have a bunch of images that I want to OCR and save as PDF/A. I have Visual Studio installed, and Adobe Acrobat Pro is installed and activated.
Using Adobe Acrobat, I can OCR an image and save it as PDF/A, but I am not able to do this from C# code. I added the COM reference to the project, but when I execute var isOCR = acrobatApp.MenuItemExecute("ADBE:DocumentRecognizeText"); it returns false.

The same happens for var isPDFA = acrobatApp.MenuItemExecute("SaveAsOther:PDF/A"); it also returns false.

Can you help, please?

Report · Jun 26, 2024

You're looking for a commercial C# SDK such as https://pspdfkit.com/pdf-library/dotnet/. It can both run OCR (with key-value pairs as well) and save the result as a PDF/A-compliant file in .NET C#.

Report · Jun 26, 2024

Acrobat is sandboxed, meaning that operations that may pose a security risk are restricted. Some are restricted to a specific context and some are blocked completely. Some menu items can only be run from a privileged context, but this is controlled by a whitelist that you can change.

Here are a couple of useful links:

https://www.adobe.com/devnet-docs/acrobatetk/tools/PrefRef/Windows/index.html

https://community.adobe.com/t5/acrobat-sdk-discussions/how-to-whitelist-quot-app-execmenuitem-quot-a...
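
As a rough illustration of changing that whitelist from C#, here is a minimal sketch. It assumes the list is the pipe-separated tWhiteList string value under the FeatureLockDown\cDefaultExecMenuItems policy key for Acrobat DC. The key path, the class name and the Merge helper are assumptions made for illustration, so check them against the Preference Reference linked above before running anything. The menu item name is the one from the original question. The linked thread is about app.execMenuItem in JavaScript, so whether this setting also affects MenuItemExecute called over COM is something to verify. Writing to this key needs administrator rights and only works on Windows.

```csharp
using System;
using System.Linq;
using Microsoft.Win32;

static class ExecMenuItemWhitelist
{
    // ASSUMPTION: policy location for Acrobat DC. The product folder and the
    // version folder ("DC") differ between installs; confirm in the Preference Reference.
    const string KeyPath =
        @"SOFTWARE\Policies\Adobe\Adobe Acrobat\DC\FeatureLockDown\cDefaultExecMenuItems";
    const string ValueName = "tWhiteList";

    // Hypothetical helper: appends menu item names to a pipe-separated list,
    // skipping names that are already present.
    public static string Merge(string existing, params string[] itemsToAdd)
    {
        var items = (existing ?? string.Empty)
            .Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
            .ToList();

        foreach (var item in itemsToAdd)
        {
            if (!items.Contains(item))
            {
                items.Add(item);
            }
        }
        return string.Join("|", items);
    }

    static void Main()
    {
        // A 64-bit process writes to the 64-bit registry view. A 32-bit Acrobat
        // reads the 32-bit view (WOW6432Node), so the view may need adjusting.
        using (RegistryKey key = Registry.LocalMachine.CreateSubKey(KeyPath))
        {
            string current = key.GetValue(ValueName) as string;

            // Menu item name copied from the question above.
            string updated = Merge(current, "ADBE:DocumentRecognizeText");

            key.SetValue(ValueName, updated, RegistryValueKind.String);
            Console.WriteLine("tWhiteList is now: " + updated);
        }
    }
}
```

If the value did not exist before, this creates it with only the added name. I am not certain whether a custom tWhiteList extends or replaces Acrobat's built-in list, so back up the key first and confirm the behaviour in the Preference Reference. Restart Acrobat after changing it.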

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
