"Unreadable content" warning in Word document when exporting from PDF

New Here, Oct 18, 2022

I am using the API to generate a .docx file from a PDF.

 

When I use this online converter tool, everything works perfectly: https://www.adobe.com/acrobat/online/pdf-to-word.html

 

However, when I use the API with the same PDF, the exported .docx file is corrupted. When I open the file in Word, I get the following message:

 

"Word found unreadable content in "Untitled Document.docx". Do you want to recover the contents of this document? If you trust the source of this document, click Yes."

 

If I click "Yes", the file opens just fine. But I can't figure out what could be going wrong for Word to be giving me this message.

 

Here is my code for sending the request:

headers = {
 'Authorization': "Bearer #{access_token}",
 'Accept': 'application/json, text/plain, */*',
 'x-api-key': client_id,
 'Prefer': "respond-async,wait=0",
 'Content-Type': "multipart/form-data"
}

j = {
 "cpf:engine": {
  "repo:assetId": "urn:aaid:cpf:Service-26c7fda2890b44ad9a82714682e35888"
 },
 "cpf:inputs": {
  "params": {
   "cpf:inline": {
    "targetFormat": "docx"
   }
  },
  "documentIn": {
   "dc:format": "application/pdf",
   "cpf:location": "InputFile0"
  }
 },
 "cpf:outputs": {
  "documentOut": {
   "dc:format": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
   "cpf:location": "multipartLabelOut"
  }
 }
}

body = {
 "contentAnalyzerRequests": j.to_json,
 "InputFile0": file.open
}

resp = post(url, body, headers)

 

TOPICS
Bug , PDF Extract API , PDF Services API , REST APIs

Adobe Employee, Nov 30, 2022

A few things. If you are calling the API directly and not through an SDK, remember that the response is NOT a pure binary file, but a multipart form response that has the binary data inside. It sounds like you are saving the response body as is, without parsing it. You need to parse it and pull out the document part to get to your file.
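To make the parsing step concrete, here is a minimal Ruby sketch of pulling the .docx part out of a raw multipart body. It is not taken from the SDK or the API docs. The helper name extract_multipart_part is made up for illustration, and it assumes the response's Content-Type header carries a boundary parameter and that the document part declares the same media type you requested in "cpf:outputs". A real response may differ in detail, so treat this as a starting point.

```ruby
# Hypothetical helper (not part of any Adobe SDK): pull one part out of a raw
# multipart response body.
#   body         - the raw response body as a String
#   content_type - the response's Content-Type header value, assumed to look
#                  like "multipart/mixed; boundary=abc123"
#   wanted_type  - media type of the part to return
def extract_multipart_part(body, content_type, wanted_type)
  boundary = content_type[/boundary="?([^";]+)"?/i, 1]
  raise ArgumentError, "no boundary in Content-Type header" unless boundary

  # Work on raw bytes so the binary .docx data is not touched by encoding logic.
  raw = body.dup.force_encoding(Encoding::BINARY)
  delimiter = "--#{boundary}".b

  raw.split(delimiter).each do |chunk|
    # Part headers and part body are separated by a blank line (CRLF CRLF).
    head, data = chunk.split("\r\n\r\n".b, 2)
    next if data.nil?
    next unless head =~ /^content-type:\s*#{Regexp.escape(wanted_type)}/i

    # Drop the CRLF that sits between the part body and the next delimiter.
    return data.delete_suffix("\r\n".b)
  end

  nil # no part with the requested media type was found
end
```

With whatever HTTP client you are using, you would pass in the raw response body and its Content-Type header, then write the returned bytes with File.binwrite("out.docx", bytes) so nothing is re-encoded on the way to disk. If the helper returns nil, print the part headers first to see what the service actually sent back.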

 

Secondly, we recently released a new API that's far simpler to use. You may want to consider using that. Here is an article on it: https://blog.developer.adobe.com/announcing-the-new-adobe-document-services-rest-apis-8d85951176cf?s...
