
"Unreadable content" warning in Word document when exporting from PDF

Community Beginner, Oct 18, 2022

I am using the API to generate a .docx file from a PDF.

 

When I use this online converter tool, everything works perfectly: https://www.adobe.com/acrobat/online/pdf-to-word.html

 

However, when I use the API with the same PDF, my exported .docx file is corrupted. When I open the file in Word, I get the following message:

 

"Word found unreadable content in "Untitled Document.docx". Do you want to recover the contents of this document? If you trust the source of this document, click Yes."

 

If I click "Yes", the file opens just fine. But I can't figure out what could be going wrong for Word to be giving me this message.

 

Here is my code for sending the request:

headers = {
 'Authorization': "Bearer #{access_token}",
 'Accept': 'application/json, text/plain, */*',
 'x-api-key': client_id,
 'Prefer': "respond-async,wait=0",
 'Content-Type': "multipart/form-data"
}

j = {
 "cpf:engine": {
  "repo:assetId": "urn:aaid:cpf:Service-26c7fda2890b44ad9a82714682e35888"
 },
 "cpf:inputs": {
  "params": {
   "cpf:inline": {
    "targetFormat": "docx"
   }
  },
  "documentIn": {
   "dc:format": "application/pdf",
   "cpf:location": "InputFile0"
  }
 },
 "cpf:outputs": {
  "documentOut": {
   "dc:format": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
   "cpf:location": "multipartLabelOut"
  }
 }
}

body = {
 "contentAnalyzerRequests": j.to_json,
 "InputFile0": file.open
}

resp = post(url, body, headers)

 

TOPICS
Bug, PDF Extract API, PDF Services API, REST APIs
Adobe Employee, Nov 30, 2022

A couple of things. If you are calling the API directly rather than through the SDK, remember that the response is NOT a pure binary file; it is a multipart form response with the binary data inside. It sounds like you are saving the response as is rather than parsing it. You will need to parse it and pull out the file part to get your .docx.
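
To illustrate the parsing step, here is a minimal Ruby sketch using only the standard library. It assumes the response carries a `Content-Type` header with a `boundary` parameter and that the file part is named `multipartLabelOut`, matching the `cpf:location` in the request above. The helper names `multipart_boundary` and `parse_multipart`, the `metadata` part name, and the demo body are hypothetical and only for illustration; how you read the header and raw body depends on your HTTP client.

```ruby
# Hypothetical helper: pull the boundary token out of a Content-Type value.
def multipart_boundary(content_type)
  match = content_type.match(/boundary="?([^";]+)"?/i)
  raise ArgumentError, 'no multipart boundary in Content-Type' unless match

  match[1]
end

# Hypothetical helper: split a raw multipart body into { part name => bytes }.
def parse_multipart(body, boundary)
  raw = body.b                        # work on a binary copy of the body
  delimiter = "--#{boundary}".b
  parts = {}

  raw.split(delimiter).each do |chunk|
    # Skip the empty preamble and the closing "--" marker.
    next if chunk.empty? || chunk.start_with?('--')

    header_block, content = chunk.split("\r\n\r\n", 2)
    next if content.nil?

    # \b keeps this from matching inside filename="...".
    name = header_block[/\bname="([^"]+)"/i, 1]
    next if name.nil?

    # Each part ends with a CRLF that belongs to the next delimiter.
    parts[name.force_encoding(Encoding::UTF_8)] = content.delete_suffix("\r\n")
  end

  parts
end

# Demo with a synthetic body (not real API output).
raw = "--Boundary_demo\r\n" \
      "Content-Disposition: form-data; name=\"metadata\"\r\n" \
      "Content-Type: application/json\r\n\r\n" \
      "{}\r\n" \
      "--Boundary_demo\r\n" \
      "Content-Disposition: form-data; name=\"multipartLabelOut\"\r\n" \
      "Content-Type: application/octet-stream\r\n\r\n" \
      "PK\x03\x04fake-docx-bytes\r\n" \
      "--Boundary_demo--\r\n"

boundary = multipart_boundary('multipart/form-data; boundary=Boundary_demo')
parts = parse_multipart(raw, boundary)
docx_bytes = parts.fetch('multipartLabelOut')

# Write in binary mode so no bytes are altered on the way to disk:
# File.binwrite('output.docx', docx_bytes)
```

With a real response, you would pass the response's `Content-Type` header and raw body in place of the demo values, then write only the extracted part to disk.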

 

Secondly, we recently released a new API that's far simpler to use. You may want to consider using that. Here is an article on it: https://blog.developer.adobe.com/announcing-the-new-adobe-document-services-rest-apis-8d85951176cf?s...

Community Beginner, Apr 17, 2023

Thanks, I ended up upgrading to the new API and it works.
