"The input file appears to be corrupted and cannot be processed" for PDF to Word conversion

New Here, May 10, 2022

Hi Team,

I'm trying to use the Adobe PDF Services API to convert a PDF to Word using the Export endpoint. On calling the poll method, I get the response below.

{
    "cpf:inputs": {
        "params": {
            "cpf:inline": {
                "targetFormat": "docx"
            }
        },
        "documentIn": {
            "cpf:location": "InputFile",
            "dc:format": "application/pdf"
        }
    },
    "cpf:engine": {
        "repo:assetId": "urn:aaid:cpf:Service-26c7fda2890b44ad9a82714682e35888"
    },
    "cpf:status": {
        "completed": true,
        "type": "",
        "title": "For application/pdf mime-type The input file appears to be corrupted and cannot be processed.; transactionId=So6I44ey8ylmc0edFm8INJLgfFjOGpZJ",
        "status": 400,
        "report": "{\"error_code\":\"CORRUPT_DOCUMENT\"}"
    }
}
 
The following code makes the call to the Export endpoint:
var client = new RestClient(url); //https://cpf-ue1.adobe.io/ops/:create

var request = new RestRequest();
request.Method = Method.Post;
request.AddQueryParameter("respondWith=", queryparam); 
request.AddHeader("Authorization", token);
request.AddHeader("Accept", "application/json,text/plain,*/*");
request.AddHeader("x-api-key", clientId);
request.AddHeader("Prefer", "respond-async,wait=0");
request.AddHeader("content-type", "multipart/form-data; boundary=----boundary");
request.AddParameter("multipart/form-data; boundary=----boundary",
"------boundary\r\nContent-Disposition: form-data; name=\"contentAnalyzerRequests\"\r\n\r\n" + jsonString +
"\r\n------boundary\r\nContent-Disposition: form-data; name=\"InputFile\"\r\n\r\n" + path + // local path where we keep the PDF file
"\r\n------boundary--", ParameterType.RequestBody);

RestResponse adobeResponse = client.ExecuteAsync(request).Result;

 

contentAnalyzerRequests body:
{"cpf:inputs":{"params":{"cpf:inline":{"targetFormat":"docx"}},"documentIn":{"cpf:location":"InputFile","dc:format":"application/pdf"}},"cpf:engine":{"repo:assetId":"urn:aaid:cpf:Service-26c7fda2890b44ad9a82714682e35888"},"cpf:outputs":{"documentOut":{"cpf:location":"multipartLabelOut","dc:format":"application/vnd.openxmlformats-officedocument.wordprocessingml.document"}}}

 

 

Response:

{"cpf:status":{"completed":false,"type":"","title":"In Progress","status":202},"cpf:engine":{"repo:assetId":"urn:aaid:cpf:Service-26c7fda2890b44ad9a82714682e35888"},"cpf:inputs":{"params":{"cpf:inline":{"targetFormat":"docx"}},"documentIn":{"cpf:location":"InputFile","dc:format":"application/pdf"}}}

TOPICS
PDF Services API

Adobe Employee, May 25, 2022

Hi, can you share your PDF? If it is private, you can email it to me directly at jedimaster@adobe.com.

New Here, May 31, 2022

Hi, I tried with arbitrary PDF files (not specific to our requirement) and get the same error. You can also reach out to Darakhshan Khan <darkhan@adobe.com>; I have shared a sample PDF with her.

Adobe Employee, May 31, 2022

Ok - then it may be in how you are using the REST API. I'm not familiar with the language you are using. Are you sure you are properly creating the multipart request and sending the binary data to the endpoint?
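
To illustrate the "binary data" point: in the code posted above, the value concatenated into the InputFile part is the `path` variable, so the request body would carry the path text rather than the bytes of the PDF. That is one possible reading of the CORRUPT_DOCUMENT error, not a confirmed diagnosis. The sketch below is a minimal example of attaching the file bytes instead. It swaps RestSharp for the standard .NET `MultipartFormDataContent` class; the names `ExportRequestSketch` and `BuildBody` are hypothetical, and the per-part content types are assumptions rather than documented requirements of the service.

```csharp
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

public static class ExportRequestSketch
{
    // Builds a multipart/form-data body with two parts:
    //   "contentAnalyzerRequests" -> the JSON job description
    //   "InputFile"               -> the raw bytes of the PDF (not its path)
    // The part names are taken from the code posted in this thread.
    public static MultipartFormDataContent BuildBody(string jsonString, string pdfPath)
    {
        var body = new MultipartFormDataContent();

        // JSON part, sent as a plain form field.
        body.Add(new StringContent(jsonString), "contentAnalyzerRequests");

        // File part: read the file from disk and attach its bytes.
        byte[] pdfBytes = File.ReadAllBytes(pdfPath);
        var filePart = new ByteArrayContent(pdfBytes);
        // Assumption: the service accepts application/pdf for this part.
        filePart.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
        body.Add(filePart, "InputFile", Path.GetFileName(pdfPath));

        return body;
    }
}
```

The returned content could then be sent with `HttpClient.PostAsync(url, body)`, after adding the same Authorization, x-api-key, and Prefer headers used in the original request. The class generates the boundary and the Content-Type header itself, so neither needs to be written by hand.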

New Here, May 31, 2022

Yes, the request is multipart, since I'm getting an initial response status of 202. When I tried changing the above code to some other format, I got an error saying it was not a multipart request.

Could you please elaborate on the binary data part?

Note: If I use the same contentAnalyzerRequests body and the file path in Postman, I'm able to convert the PDF.

The complete code is in C#.

Adobe Employee, Jun 01, 2022

To be clear, I meant the *response* is multipart. Your code takes it and saves it as is, but it needs to parse it first. 

New Here, Jun 03, 2022

The response from the Export endpoint gives status 202 with "title":"In Progress". Then, as per the documentation, we call the poll method, and from there we get "CORRUPT_DOCUMENT", so I'm not able to save any response.

We are following the document below:

PDF Services API (adobe.com)

Adobe Employee, Jun 03, 2022

I believe you want to wait until the result says Completed. From the docs: "the state of the request for e.g. In Progress, Completed. In case of Failure, a descriptive error message will be returned"
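
The two status payloads quoted earlier in this thread can be told apart by the `completed` and `status` fields under `cpf:status`: the first response has `completed: false` with status 202, and the poll response has `completed: true` with status 400. A minimal sketch of that check follows, assuming the poll response is JSON of the same shape. The `JobState` enum and the `StatusSketch.Classify` helper are hypothetical names, and treating any status of 400 or above as a failure is an assumption, not something stated in the docs quoted here.

```csharp
using System.Text.Json;

public enum JobState { InProgress, Failed, Done }

public static class StatusSketch
{
    // Classifies a status payload by its "cpf:status" object.
    public static JobState Classify(string statusJson)
    {
        using var doc = JsonDocument.Parse(statusJson);
        JsonElement status = doc.RootElement.GetProperty("cpf:status");

        // Not finished yet: the caller should wait and poll again.
        if (!status.GetProperty("completed").GetBoolean())
            return JobState.InProgress;

        // Finished: an HTTP-style error code marks a failed job (assumption).
        int code = status.GetProperty("status").GetInt32();
        return code >= 400 ? JobState.Failed : JobState.Done;
    }
}
```

A polling loop would call `Classify` on each JSON response and sleep between attempts while the result is `InProgress`. With the payloads from this thread, the first response classifies as `InProgress` and the poll response as `Failed`.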

New Here, Jun 07, 2022

The response header of the POST call returns a poll location URL for retrieving the generated output file. When polling that URL, we get the above "CORRUPT_DOCUMENT" message.

[Attached image: Sudheer24402926bp60_1-1654610257386.png]

 

Adobe Employee, Jun 13, 2022

Right - but I believe you can't get the location until the status is complete though. Did you wait for that?
