PDF generation API error on recent DOCX

Report · Apr 03, 2024

Hello,
I use the PDF generation API to send a docx template plus an array of data and retrieve a PDF. Everything was working fine until the end of March.
A working docx template no longer works if I save it again (for example: add a space, remove it, save).
The problem occurs only when converting to PDF: template docx to docx seems to work fine.

'cpf:inline' => [
    'outputFormat'     => 'pdf',
    'jsonDataForMerge' => ['toto' => 'toto'],
],

Setting outputFormat to pdf triggers the error.

Here is the UNSUPPORTED_OR_CORRUPTED_TEMPLATE error:

\"cpf:status\":{\"completed\":true,\"type\":\"\",\"title\":\"Either unsupported or corrupted file content provided. The only supported format is docx; transactionId=XpY0r4AgreDVkynOlMAPp60Qau7COeTL\",\"status\":400,\"report\":\"{\\\"error_code\\\":\\\"UNSUPPORTED_OR_CORRUPTED_TEMPLATE\\\",\\\"source\\\":\\\"docgen_engine\\\"}\"}
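For anyone scripting against this response: the `report` field appears to be a JSON-encoded string nested inside the status object, so it needs a second parse. Below is a minimal sketch, assuming the `cpf:status` object has already been decoded from the response body. The helper name `extractErrorCode` is hypothetical and not part of any Adobe SDK.

```javascript
// Hypothetical helper: pull the error code out of a decoded cpf:status object.
// Assumes `status.report` is a JSON-encoded string, as in the error above.
function extractErrorCode(status) {
  if (!status || typeof status.report !== "string") return null;
  try {
    const report = JSON.parse(status.report);
    return typeof report.error_code === "string" ? report.error_code : null;
  } catch (err) {
    return null; // report was missing the expected shape or was not valid JSON
  }
}

// Sample object modelled on the error pasted above.
const sampleStatus = {
  completed: true,
  status: 400,
  report:
    '{"error_code":"UNSUPPORTED_OR_CORRUPTED_TEMPLATE","source":"docgen_engine"}',
};

console.log(extractErrorCode(sampleStatus));
```

Logging the extracted code alongside the transactionId makes it easier to compare failing and working templates.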

It seems there may be a conflict between the recent Microsoft Word update and the Adobe API service?
You can reproduce it even with an empty template.

Report · Apr 03, 2024

Are you using the old REST API? If so - please switch to the new one and test there.

Report · Apr 03, 2024

OK, thanks, I will try it with a basic test. Can you let me know when the "old" API will no longer be usable?

Report · Apr 03, 2024

I don't think we're taking it away, but we aren't providing any support for it.

Report · Apr 04, 2024

Just tested with the new API: it works fine with my old docx files.
With new docx files I still get an error:

{
    "error": {
        "code": "CORRUPT_DOCUMENT",
        "message": "The input file appears to be corrupted and cannot be processed.; requestId=ZdkUlF4mpA4hTS6P7Z4t41IdPQkohvwo",
        "status": 400
    },
    "status": "failed"
}

It seems to be the same problem

Report · Apr 04, 2024

Can you share your code, specifically where you upload your source file?

Report · Apr 04, 2024

I was testing in Postman.

POST to https://pdf-services-ue1.adobe.io/operation/documentgeneration
Replace the uri with the location of the attached empty docx file (tested with AWS).

{
  "input": {
    "uri": "xxxx"
  },
  "params": {
    "outputFormat": "pdf",
    "jsonDataForMerge": {
      "customerName": "Kane Miller",
      "customerVisits": 100,
      "itemsBought": [
        {
          "name": "Sprays",
          "quantity": 50,
          "amount": 100
        },
        {
          "name": "Chemicals",
          "quantity": 100,
          "amount": 200
        }
      ],
      "totalAmount": 300,
      "previousBalance": 50,
      "lastThreeBillings": [
        100,
        200,
        300
      ],
      "photograph": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mP88h8AAu0B9XNPCQQAAAAASUVORK5CYII="
    }
  }
}

Report · Apr 04, 2024

Tested by changing the output format:

"outputFormat": "docx",

Same result: docx output works fine, pdf does not.
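The comparison above can be sketched as two request bodies that differ only in `outputFormat`. This is a minimal illustration, assuming the body shape shown in the Postman request earlier in the thread. The helper name `buildMergeRequest` is hypothetical, and the `"xxxx"` uri is the same placeholder used above.

```javascript
// Hypothetical helper: build a document generation request body.
// The shape (input.uri, params.outputFormat, params.jsonDataForMerge)
// is copied from the Postman body posted earlier in this thread.
function buildMergeRequest(uri, outputFormat, jsonDataForMerge) {
  return {
    input: { uri },
    params: { outputFormat, jsonDataForMerge },
  };
}

const mergeData = { customerName: "Kane Miller", customerVisits: 100 };

// Same template and data, only the output format changes.
const pdfRequest = buildMergeRequest("xxxx", "pdf", mergeData);
const docxRequest = buildMergeRequest("xxxx", "docx", mergeData);

console.log(JSON.stringify(pdfRequest, null, 2));
console.log(JSON.stringify(docxRequest, null, 2));
```

Keeping everything else identical helps confirm that the failure depends on the output format rather than on the template location or the merge data.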

Report · Apr 04, 2024

You mentioned the doc was empty. If you add -anything- to the Word doc template, does it work?

Report · Apr 05, 2024

I tested it with my complete template in the first place. I only sent you the empty template, and tested with it, to rule out any other interference, because it should work, and to show you that the problem is probably a conflict between the Adobe API service and a recent update of Microsoft Word.
My backup solution for the moment is to use Google Docs.

Report · Apr 09, 2024

The problem can be reproduced on the demo API: https://acrobatservices.adobe.com/dc-docgen-playground/index.html#/
Can we have an update on the fix?

Report · Apr 04, 2024

I'm facing the same issue, except that I'm using the Node SDK.

Report · Apr 04, 2024

@Yashwanth36533953k2p5 can you share your Word doc and code?

Report · Apr 04, 2024

import PDFServicesSdk from "@adobe/pdfservices-node-sdk";
import * as batcher from "./batcher.js";

export async function createPdfs(templatesComponents) {
  const promises = [];

  for (const templateComponents of templatesComponents) {
    const { input, templatePath, outputPath } = templateComponents;
    const job = async () => await createPdf(input, templatePath, outputPath);
    const promise = batcher.pushJob(job);
    promises.push(promise);
  }

  return await Promise.all(promises);
}

export async function combinePdfs(filePaths, outputPath) {
  const job = async () => await _combinePdfs(filePaths, outputPath);
  return await batcher.pushJob(job);
}

async function _combinePdfs(filePaths, outputPath) {
  // Initial setup, create credentials instance.
  const executionContext = getExecutionContext();
  const combineFilesOperation =
    PDFServicesSdk.CombineFiles.Operation.createNew();

  // Set operation input from a source file.
  for (const filePath of filePaths) {
    const source = PDFServicesSdk.FileRef.createFromLocalFile(filePath);
    combineFilesOperation.addInput(source);
  }

  // Execute the operation and Save the result to the specified location.
  const result = await combineFilesOperation.execute(executionContext);
  await result.saveAsFile(outputPath);
  return outputPath;
}

async function createPdf(input, templatePath, outputPath) {
  const executionContext = getExecutionContext();

  // Create a new DocumentMerge options instance.
  const documentMerge = PDFServicesSdk.DocumentMerge;
  const documentMergeOptions = documentMerge.options;

  const options = new documentMergeOptions.DocumentMergeOptions(
    input,
    documentMergeOptions.OutputFormat.PDF
  );

  // Create a new operation instance using the options instance.
  const documentMergeOperation = documentMerge.Operation.createNew(options);

  // Set operation input document template from a source file.
  const template = PDFServicesSdk.FileRef.createFromLocalFile(templatePath);
  documentMergeOperation.setInput(template);

  // Execute the operation and Save the result to the specified location.
  const result = await documentMergeOperation.execute(executionContext);
  await result.saveAsFile(outputPath);
  return outputPath;
}

function getExecutionContext() {
  // Initial setup, create credentials instance.
  const credentials =
    PDFServicesSdk.Credentials.serviceAccountCredentialsBuilder()
      .fromFile("credentials/pdfservices-api-credentials.json")
      .build();

  const clientConfig = PDFServicesSdk.ClientConfig.clientConfigBuilder()
    .withConnectTimeout(20000)
    .withReadTimeout(20000)
    .build();

  // Create an ExecutionContext using credentials.
  return PDFServicesSdk.ExecutionContext.create(credentials, clientConfig);
}

Report · Apr 05, 2024

Nothing sticks out. I assume you can verify that "input" is data that is an object at the top level, right? What I mean is, the data sent to doc gen must be an object at the top level, not an array.
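One way to verify this before calling `createPdf` is a small pre-flight check. A minimal sketch follows; the helper name `isTopLevelObject` is hypothetical and not part of the SDK.

```javascript
// Hypothetical pre-flight check: the merge data must be a non-null object
// at the top level, not an array (and not a string or number).
function isTopLevelObject(data) {
  return data !== null && typeof data === "object" && !Array.isArray(data);
}

console.log(isTopLevelObject({ customerName: "Kane Miller" })); // true
console.log(isTopLevelObject([{ customerName: "Kane Miller" }])); // false
```

If the data arrives as an array, wrapping it in an object (for example `{ items: data }`) and referencing that key in the template is one possible adjustment.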

Report · Dec 17, 2024

Hi @Raymond Camden, I think the problem is the version of Word used to generate the .docx template.

If I download the samples from the Adobe playground (https://acrobatservices.adobe.com/dc-docgen-playground/index.html#/) and use them without any change, they work well.

But if I change anything in the Word template, I receive the error:

"code": "CORRUPT_DOCUMENT",
"message": "The input file appears to be corrupted and cannot be processed.;

I have attached the original file, the changed file, and the JSON that you can use to reproduce this.

Report · Jan 07, 2025

It seems that the problem has reappeared, yes. If you need an empty docx to work on, let me know.

Report · Jan 07, 2025

So people know: a workaround is to not use Microsoft Word (Google Docs works fine) while awaiting a fix.

Report · Jan 12, 2025

I'm having the same problem. I am using the Java SDK and developed the full solution 4-5 weeks ago for PDF generation based on docx templates. I started working on it again to prepare a demo for my leadership team and hit a roadblock. Any time I upload a new Word docx template and generate a PDF file after merging data, I get statusCode=400; errorCode=CORRUPT_DOCUMENT when polling for the result. My old templates still work fine, but if I re-save an old template and generate a PDF, I get the same CORRUPT_DOCUMENT error.

Amazingly, if I set OutputFormat.DOCX, I do not get any error. It does return a docx with the merged data.

It only happens when withOutputFormat(OutputFormat.PDF) is used.

The workaround for me is to upload the doc to Google Docs, save it as a Google Doc, and then download it as a Word docx. But this process changes the template formatting when tables are used in a text area.

I'm looking for guidance or a fix so I can prepare for the demo with 4-5 templates and get approval for onboarding the service.

To reproduce this issue, one can use the sample MergeDocumentToPDF.java provided with the Java SDK. If you use the template provided in the samples, it works fine. When you re-save the templates using Word and run it, you will get the same CORRUPT_DOCUMENT error.

Report · Jan 07, 2025

I don't think that an empty docx can be a solution because I need to alter them to build the template.

Report · Jan 07, 2025

I didn't know that the template can be built in Google Docs. Do you have a link to documentation about this? Is there a plugin for Google Docs like the one for MS Word?

Report · May 14, 2024

Any updates about this? I have the same issue with template docx & json file from the playground https://acrobatservices.adobe.com/dc-docgen-playground/index.html#/ . Here's my postman collection https://drive.google.com/file/d/1GWFzKj5-uzn2Ed4JlJY6oVS4dnIfyaGL/view?usp=sharing

Report · Feb 23, 2025

I'm getting the same kind of problem.

New documents are generating ok, however some of my existing documents no longer generate. Even if I delete all content from the document it is still rejected as corrupted.

I've attached a minimal document example.

Unzipping the document, the only thing that's obvious is that some 'media' has been retained even though it is not referenced in the content.
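For anyone who wants to check their own files for the same pattern, here is a minimal sketch of the comparison. It assumes you have already unzipped the docx and have the list of entry names plus the text of word/_rels/document.xml.rels. The helper name `findUnreferencedMedia` is hypothetical. It only looks at the one relationships file passed in, so media referenced from headers or footers (which have their own .rels parts) would be reported as unreferenced. Whether leftover media is actually what triggers CORRUPT_DOCUMENT is not established in this thread.

```javascript
// Hypothetical helper: list word/media/ entries that the given .rels XML
// does not point to. Reading the zip itself is left out of this sketch.
function findUnreferencedMedia(entryNames, relsXml) {
  const referenced = new Set();
  for (const match of relsXml.matchAll(/Target="([^"]+)"/g)) {
    const target = match[1];
    // Targets are usually relative to the word/ folder;
    // a leading slash means the path starts at the package root.
    referenced.add(target.startsWith("/") ? target.slice(1) : "word/" + target);
  }
  return entryNames.filter(
    (name) => name.startsWith("word/media/") && !referenced.has(name)
  );
}

// Made-up sample data for illustration only.
const sampleEntries = [
  "word/document.xml",
  "word/media/image1.png",
  "word/media/image2.png",
];
const sampleRels =
  '<Relationships><Relationship Id="rId1" ' +
  'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" ' +
  'Target="media/image1.png"/></Relationships>';

console.log(findUnreferencedMedia(sampleEntries, sampleRels));
```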
