PDF generation API error on recent DOCX
Hello,
I use the PDF generation API to send a docx template plus a data array and get a PDF back. Everything had been working fine since the end of March.
A working docx template no longer works if I save it again (for example: add a space, remove it, save).
The problem occurs only when converting to PDF; template docx to docx seems to work fine.
'cpf:inline' => [
    'outputFormat' => 'pdf',
    'jsonDataForMerge' => ['toto' => 'toto'],
],
Setting outputFormat to PDF triggers the error.
Here is the UNSUPPORTED_OR_CORRUPTED_TEMPLATE error:
\"cpf:status\":{\"completed\":true,\"type\":\"\",\"title\":\"Either unsupported or corrupted file content provided. The only supported format is docx; transactionId=XpY0r4AgreDVkynOlMAPp60Qau7COeTL\",\"status\":400,\"report\":\"{\\\"error_code\\\":\\\"UNSUPPORTED_OR_CORRUPTED_TEMPLATE\\\",\\\"source\\\":\\\"docgen_engine\\\"}\"}
It seems there may be a conflict between a Microsoft Word update and the Adobe API service?
You can reproduce it even with an empty template.
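In the error quoted above, the `report` field is itself a JSON string nested inside the JSON response, so it needs a second parse before the error code can be read. The following is a minimal sketch of that, assuming the response has already been decoded into an object shaped like the `cpf:status` value above; `extractErrorCode` is a hypothetical helper name, not part of any Adobe SDK.

```javascript
// Sketch only: read the error details out of a `cpf:status` object.
// `extractErrorCode` is a hypothetical helper, not an Adobe SDK function.
function extractErrorCode(cpfStatus) {
  // `report` is a JSON string inside the JSON response, so parse it separately.
  let report = {};
  try {
    report = JSON.parse(cpfStatus.report ?? "{}") ?? {};
  } catch {
    // Keep the empty report if the field is not valid JSON.
  }
  // The transaction id is appended to the title after "transactionId=".
  const match = /transactionId=([A-Za-z0-9]+)/.exec(cpfStatus.title ?? "");
  return {
    status: cpfStatus.status,
    errorCode: report.error_code ?? null,
    source: report.source ?? null,
    transactionId: match ? match[1] : null,
  };
}

// Sample values copied from the error message in this post.
const sample = {
  completed: true,
  type: "",
  title:
    "Either unsupported or corrupted file content provided. " +
    "The only supported format is docx; transactionId=XpY0r4AgreDVkynOlMAPp60Qau7COeTL",
  status: 400,
  report: '{"error_code":"UNSUPPORTED_OR_CORRUPTED_TEMPLATE","source":"docgen_engine"}',
};

console.log(extractErrorCode(sample));
```

Logging the error code together with the transaction id makes it easier to quote both when reporting the failure.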
Are you using the old REST API? If so - please switch to the new one and test there.
OK, thanks, I will try it for a basic test. Can you let me know when the "old" API will no longer be usable?
I don't think we're taking it away, but we aren't providing any support for it.
Just tested with the new API: it works fine with my old docx files.
With new docx files I still get an error:
{
  "error": {
    "code": "CORRUPT_DOCUMENT",
    "message": "The input file appears to be corrupted and cannot be processed.; requestId=ZdkUlF4mpA4hTS6P7Z4t41IdPQkohvwo",
    "status": 400
  },
  "status": "failed"
}
It seems to be the same problem
Can you share your code, specifically where you upload your source file?
I was testing in Postman:
POST to https://pdf-services-ue1.adobe.io/operation/documentgeneration
Replace the uri with the URL of the attached empty docx file (tested with AWS).
{
  "input": {
    "uri": "xxxx"
  },
  "params": {
    "outputFormat": "pdf",
    "jsonDataForMerge": {
      "customerName": "Kane Miller",
      "customerVisits": 100,
      "itemsBought": [
        {
          "name": "Sprays",
          "quantity": 50,
          "amount": 100
        },
        {
          "name": "Chemicals",
          "quantity": 100,
          "amount": 200
        }
      ],
      "totalAmount": 300,
      "previousBalance": 50,
      "lastThreeBillings": [
        100,
        200,
        300
      ],
      "photograph": ""
    }
  }
}
Tested by changing the output:
"outputFormat": "docx",
Same result: docx output works fine, PDF does not.
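To compare the two output formats side by side with the same template and data, the request body from the Postman test can be built by a small function. This is only a sketch: `buildDocGenBody` is a hypothetical helper, the field names are copied from the request shown in this thread, and authentication headers and the HTTP call itself are deliberately left out.

```javascript
// Sketch only: build the request body used in the Postman test in this thread.
// `buildDocGenBody` is a hypothetical helper, not part of any Adobe SDK.
function buildDocGenBody(templateUri, data, outputFormat) {
  // Only the two formats discussed in this thread are accepted here.
  if (outputFormat !== "pdf" && outputFormat !== "docx") {
    throw new Error(`Unexpected outputFormat: ${outputFormat}`);
  }
  return {
    input: { uri: templateUri },
    params: { outputFormat, jsonDataForMerge: data },
  };
}

// Same template URI placeholder and data, one body per output format.
const mergeData = { customerName: "Kane Miller", customerVisits: 100 };
const bodies = ["pdf", "docx"].map((format) =>
  buildDocGenBody("xxxx", mergeData, format)
);

console.log(JSON.stringify(bodies, null, 2));
```

Sending both bodies against the same re-saved template isolates the failure to the PDF conversion step, which is what the test above showed.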
You mentioned the doc was empty. If you add -anything- to the Word doc template, does it work?
I tested it with my complete template in the first place. I only sent you the empty template, and tested with it, to rule out any other interference, because it should work, and to show you that the problem is probably a conflict between the Adobe API service and a recent update of Microsoft Word.
My backup solution for the moment is to use Google Docs.
The problem can be reproduced on the demo API: https://acrobatservices.adobe.com/dc-docgen-playground/index.html#/
Can we have an update on the fix?
I'm facing the same issue, except that I'm using the Node SDK.
@Yashwanth36533953k2p5 can you share your Word doc and code?
import PDFServicesSdk from "@adobe/pdfservices-node-sdk";
import * as batcher from "./batcher.js";

export async function createPdfs(templatesComponents) {
  const promises = [];
  for (const templateComponents of templatesComponents) {
    const { input, templatePath, outputPath } = templateComponents;
    const job = async () => await createPdf(input, templatePath, outputPath);
    const promise = batcher.pushJob(job);
    promises.push(promise);
  }
  return await Promise.all(promises);
}

export async function combinePdfs(filePaths, outputPath) {
  const job = async () => await _combinePdfs(filePaths, outputPath);
  return await batcher.pushJob(job);
}

async function _combinePdfs(filePaths, outputPath) {
  // Initial setup, create credentials instance.
  const executionContext = getExecutionContext();
  const combineFilesOperation =
    PDFServicesSdk.CombineFiles.Operation.createNew();

  // Set operation input from a source file.
  for (const filePath of filePaths) {
    const source = PDFServicesSdk.FileRef.createFromLocalFile(filePath);
    combineFilesOperation.addInput(source);
  }

  // Execute the operation and Save the result to the specified location.
  const result = await combineFilesOperation.execute(executionContext);
  await result.saveAsFile(outputPath);
  return outputPath;
}

async function createPdf(input, templatePath, outputPath) {
  const executionContext = getExecutionContext();

  // Create a new DocumentMerge options instance.
  const documentMerge = PDFServicesSdk.DocumentMerge;
  const documentMergeOptions = documentMerge.options;
  const options = new documentMergeOptions.DocumentMergeOptions(
    input,
    documentMergeOptions.OutputFormat.PDF
  );

  // Create a new operation instance using the options instance.
  const documentMergeOperation = documentMerge.Operation.createNew(options);

  // Set operation input document template from a source file.
  const template = PDFServicesSdk.FileRef.createFromLocalFile(templatePath);
  documentMergeOperation.setInput(template);

  // Execute the operation and Save the result to the specified location.
  const result = await documentMergeOperation.execute(executionContext);
  await result.saveAsFile(outputPath);
  return outputPath;
}

function getExecutionContext() {
  // Initial setup, create credentials instance.
  const credentials =
    PDFServicesSdk.Credentials.serviceAccountCredentialsBuilder()
      .fromFile("credentials/pdfservices-api-credentials.json")
      .build();
  const clientConfig = PDFServicesSdk.ClientConfig.clientConfigBuilder()
    .withConnectTimeout(20000)
    .withReadTimeout(20000)
    .build();

  // Create an ExecutionContext using credentials.
  return PDFServicesSdk.ExecutionContext.create(credentials, clientConfig);
}
Nothing sticks out. I assume you can verify that "input" is data that is an object at the top level, right? What I mean is, the data sent to doc gen must be an object at the top level, not an array.
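One way to verify that before calling the SDK is a small guard like the sketch below. `isTopLevelObject` is a hypothetical helper name used here for illustration; it only checks the shape of the data and says nothing about whether the template itself is valid.

```javascript
// Sketch only: check that merge data is a plain object at the top level, not an array.
// `isTopLevelObject` is a hypothetical helper, not part of the Adobe SDK.
function isTopLevelObject(value) {
  // typeof null is "object" and arrays are objects too, so exclude both explicitly.
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

console.log(isTopLevelObject({ toto: "toto" }));   // true
console.log(isTopLevelObject([{ toto: "toto" }])); // false
```

If the data does arrive as an array, wrapping it in an object (for example under a key of your choosing) gives it the required top-level shape.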
Hi @Raymond Camden, I think the problem is the version of Word used to generate the .docx template.
If I download the samples from the Adobe playground (https://acrobatservices.adobe.com/dc-docgen-playground/index.html#/) and use them without any change, they work well.
But if I change anything in the Word template, I receive the error:
- "code": "CORRUPT_DOCUMENT",
- "message": "The input file appears to be corrupted and cannot be processed.;
I have attached the original file, the modified one, and the JSON that you can use to reproduce this.
It seems that the problem has reappeared, yes. If you need an empty docx to work on, let me know.
So people know: a workaround is to not use Microsoft Word (Google Docs works fine) while awaiting a fix.
I'm having the same problem. I am using the Java SDK and developed the full solution 4-5 weeks ago for PDF generation based on docx templates. I started working on it again to prepare a demo for my leadership team and hit a roadblock. Any time I upload a new Word docx template and generate a PDF file after merging data, I get statusCode=400; errorCode=CORRUPT_DOCUMENT when polling for the result. My old templates still work fine, but if I re-save an old template and generate a PDF, I get the same CORRUPT_DOCUMENT error.
Amazingly, if I set OutputFormat.DOCX, I do not get any error. It does return a docx with the merged data.
It only happens when withOutputFormat(OutputFormat.PDF) is used.
The workaround for me is to upload the doc to Google Docs, save it as a Google Doc, and then download it as a Word docx. But this process changes the template formatting when tables are used in a text area.
I'm looking for guidance or a fix so I can prepare for the demo with 4-5 templates and get approval for onboarding the service.
To reproduce this issue, one can use the sample MergeDocumentToPDF.java provided with the Java SDK. If you use the template provided in the samples, it works fine. When you re-save the templates using Word and run it, you get the same CORRUPT_DOCUMENT error.
I don't think that an empty docx can be a solution because I need to alter them to build the template.
I didn't know that the template can be built in Google Docs. Do you have a link to documentation about this? Is there a plugin for Google Docs like the one for MS Word?
Any updates about this? I have the same issue with template docx & json file from the playground https://acrobatservices.adobe.com/dc-docgen-playground/index.html#/ . Here's my postman collection https://drive.google.com/file/d/1GWFzKj5-uzn2Ed4JlJY6oVS4dnIfyaGL/view?usp=sharing
I'm getting the same kind of problem.
New documents generate OK, but some of my existing documents no longer generate. Even if I delete all content from the document, it is still rejected as corrupted.
I've attached a minimal document example.
After unzipping the document, the only thing that's obvious is that some 'media' has been retained even though it is not referenced in the content.

