We're getting different output when we use COM to export a PDF to docx versus manually opening the PDF in Acrobat and exporting to docx. The manual export looks much cleaner, while the COM export looks old and dated.
Does anyone know why?
What method are you using via COM to do this?
Via COM, the sequence is: open the file -> GetJSObject -> SaveAs (with the conversion ID set to com.adobe.acrobat.docx).
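For anyone following along, here is roughly what that sequence looks like as a minimal Python sketch. It is not our exact code. It assumes pywin32 is installed on a Windows desktop with Acrobat, and that the document is opened through the `AcroExch.PDDoc` COM object. The `dispatch` parameter and the function name are hypothetical, added only so the call order can be exercised without Acrobat, and the file paths in the usage comment are placeholders.

```python
# Conversion ID passed to SaveAs, as quoted in the thread.
DOCX_CONV_ID = "com.adobe.acrobat.docx"


def export_pdf_to_docx(pdf_path, docx_path, dispatch=None):
    """Open a PDF via Acrobat COM and save it as docx.

    `dispatch` is a hypothetical hook: any callable that takes a ProgID
    and returns a COM-like object. It defaults to pywin32's Dispatch.
    """
    if dispatch is None:
        # Assumption: pywin32 is available (Windows with Acrobat installed).
        from win32com.client import Dispatch as dispatch

    pd_doc = dispatch("AcroExch.PDDoc")
    if not pd_doc.Open(pdf_path):
        raise RuntimeError(f"Acrobat could not open {pdf_path!r}")
    try:
        # GetJSObject exposes the document's JavaScript interface over COM.
        js_object = pd_doc.GetJSObject()
        js_object.SaveAs(docx_path, DOCX_CONV_ID)
    finally:
        # Always release the document, even if the export fails.
        pd_doc.Close()


# Example call (placeholder paths, requires Acrobat):
# export_pdf_to_docx(r"C:\temp\input.pdf", r"C:\temp\output.docx")
```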
That call is equivalent to exporting from the UI, and it uses the conversion settings/preferences configured there.
Of course, make sure that you have only a single copy of Acrobat on your machine and that you are doing this on a desktop, not in a server environment.
I'll reinstall everything to double-check, but it should be the latest Creative Cloud Acrobat on a desktop. I know that when exporting to docx through the UI you can set the OCR language (if one is detected). How do we do that from the COM request?
As for a server environment, do you guys have a solution for that too?
You have to set the defaults in the UI; there is no API for that via COM/JS.
Yes, for server side we offer our AEM Forms product which includes this technology.
Gotcha, so it is limited to the auto-detect language option then.
As for AEM Forms: our goal is just to take in a PDF and get back a docx file via some sort of API/COM. Is that what it does?
AEM Forms does many things – the export to DocX is just one of them.