"Unreadable content" warning in Word document when exporting from PDF

Report · Oct 18, 2022

I am using the API to generate a .docx file from a PDF.

When I use this online converter tool, everything works perfectly: https://www.adobe.com/acrobat/online/pdf-to-word.html

However, when I use the API with the same PDF, my exported .docx file is corrupted. When I open the file in Word, I get the following message:

"Word found unreadable content in "Untitled Document.docx". Do you want to recover the contents of this document? If you trust the source of this document, click Yes."

If I click "Yes", the file opens just fine. But I can't figure out what is causing Word to show this message.

Here is my code for sending the request:

# access_token, client_id, url, file and post are defined elsewhere in my app.
headers = {
  'Authorization': "Bearer #{access_token}",
  'Accept': 'application/json, text/plain, */*',
  'x-api-key': client_id,
  'Prefer': "respond-async,wait=0",
  'Content-Type': "multipart/form-data"
}

# Job description: export the uploaded PDF (InputFile0) to docx.
j = {
  "cpf:engine": {
    "repo:assetId": "urn:aaid:cpf:Service-26c7fda2890b44ad9a82714682e35888"
  },
  "cpf:inputs": {
    "params": {
      "cpf:inline": {
        "targetFormat": "docx"
      }
    },
    "documentIn": {
      "dc:format": "application/pdf",
      "cpf:location": "InputFile0"
    }
  },
  "cpf:outputs": {
    "documentOut": {
      "dc:format": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
      "cpf:location": "multipartLabelOut"
    }
  }
}

# Multipart form fields: the JSON job description plus the PDF itself.
body = {
  "contentAnalyzerRequests": j.to_json,
  "InputFile0": file.open
}

resp = post(url, body, headers)

Report · Nov 30, 2022

A few things. If you are calling the API directly rather than through the SDK, remember that the response is NOT a pure binary file but a multipart form response with the binary data inside it. It sounds like you are saving the response as is, without parsing it. You need to parse the response and pull out the file part in order to get a valid .docx.
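To illustrate the parsing step, here is a minimal Ruby sketch using only core String methods. It is not an official client. The helper names parse_multipart and extract_docx are made up for this example, and it assumes the final response declares its boundary in the Content-Type header, uses CRLF line endings, and labels the Word part with the same MIME type that the request put in "dc:format". Check the actual response headers before relying on any of that.

```ruby
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"

# Hypothetical helper: splits a multipart body into parts.
# Returns an array of { headers: Hash, body: String (binary) }.
def parse_multipart(content_type, raw_body)
  boundary = content_type[/boundary="?([^";]+)"?/i, 1]
  raise ArgumentError, "no multipart boundary in #{content_type.inspect}" unless boundary

  # Every delimiter is CRLF + "--" + boundary. Prepending CRLF lets the
  # first delimiter match as well. Work on bytes so binary data survives.
  delimiter = "\r\n--#{boundary}".b
  segments = ("\r\n".b + raw_body.b).split(delimiter).drop(1) # drop the preamble

  segments.filter_map do |segment|
    next if segment.start_with?("--") # closing delimiter "--boundary--"

    head, body = segment.delete_prefix("\r\n").split("\r\n\r\n", 2)
    next if body.nil? # malformed part without a header/body separator

    headers = head.split("\r\n").to_h do |line|
      name, value = line.split(":", 2)
      [name.strip.downcase, value.to_s.strip]
    end
    { headers: headers, body: body }
  end
end

# Hypothetical helper: returns the bytes of the Word part, or nil if none.
# Assumption: the part carries the docx MIME type in its Content-Type header.
def extract_docx(content_type, raw_body)
  part = parse_multipart(content_type, raw_body).find do |candidate|
    candidate[:headers]["content-type"].to_s.start_with?(DOCX_MIME)
  end
  part && part[:body]
end

# Usage (how you read headers and body depends on your HTTP client):
#   docx = extract_docx(resp.headers["content-type"], resp.body)
#   File.binwrite("output.docx", docx) if docx
```

A .docx is a ZIP archive, so the extracted bytes should begin with "PK". If your saved file starts with a boundary line or JSON instead, you wrote out the whole multipart response rather than just the file part.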

Secondly, we recently released a new API that's far simpler to use. You may want to consider using that. Here is an article on it: https://blog.developer.adobe.com/announcing-the-new-adobe-document-services-rest-apis-8d85951176cf?s...

Report · Apr 17, 2023

Thanks, I ended up upgrading to the new API and it works.