Adobe Document Services performance issues
Hi all,
I'm exploring Adobe Document Services as a template-based option for generating PDF reports. So far, I've been very happy with the template creation plugin and how it all comes together. However, the call
documentMergeOperation.Execute(executionContext);
is taking 5-10 seconds when executed from an extremely light .NET Core console application. I wouldn't consider the DTO for this report to be complex enough to merit the execution time: 3 nested objects, 1 Base64-encoded image, 1 collection with 3 elements, and a dozen more string/DateTime properties. The template is basically just a single page with mostly text tags, 1 image, and a table.
Our .NET Core application is running document services from an installed package, rather than through Adobe Cloud Services. We are not able to send our DTOs to ACS due to security requirements. Currently, we run documentMergeOperation.Execute on a single DTO + template (we're in the proof of concept stage). I'm planning to explore runtime across different DTOs, templates, and numbers of reports produced, but haven't yet.
I'm assuming that Execute awaits credential verification via web request, and maybe that's where some of this long runtime is coming from. I don't see any way to cache or re-use this verification between requests, but maybe that happens automatically under the API's hood.
Has anyone else dealt with runtime issues around the Document Services API? Does anyone have any suggestions for how we might improve runtime? 5-10 seconds runtime per doc will make scaling a PDF generation service difficult.
I appreciate any help or similar experiences!
Max
Creating a PDF file from Word running on a local machine can take that long. Just to verify that it's the PDF creation from the final Word file rather than the merge that's taking the time, try setting the output to .docx and see if the time is reduced. Please let me know your results.
Thanks for the suggestion Joel!
I've got two reports now and they both generate in roughly the same time. Both reports are based on docx templates but have significant differences in content. Both reports generate to PDF in about 8 seconds and to docx in about 3 seconds, so I think that supports your theory. Generation time seems to be pretty consistent between runs, but I haven't dug into programmatic benchmarking yet.
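For anyone who wants to put numbers on the docx vs. PDF comparison, here is a minimal timing sketch built only on System.Diagnostics.Stopwatch. MergeBenchmark, Measure, and Median are hypothetical names made up for illustration; nothing here comes from the Adobe SDK. The delegate you pass in is assumed to do one complete merge from start to finish. This thread doesn't establish whether an operation object can be executed more than once, so it is probably safest to build a fresh one inside the delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class MergeBenchmark
{
    // Runs `operation` `warmup` times untimed, then `runs` times timed,
    // and returns the wall-clock time of each timed run.
    public static List<TimeSpan> Measure(Action operation, int runs = 5, int warmup = 1)
    {
        if (operation == null) throw new ArgumentNullException(nameof(operation));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        // Warm-up runs absorb one-time costs such as JIT compilation.
        for (int i = 0; i < warmup; i++) operation();

        var timings = new List<TimeSpan>(runs);
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            operation();
            stopwatch.Stop();
            timings.Add(stopwatch.Elapsed);
        }
        return timings;
    }

    // Median of the timings; for an even count this is the upper of the two middle values.
    public static TimeSpan Median(IReadOnlyList<TimeSpan> timings)
    {
        if (timings.Count == 0) throw new ArgumentException("No timings supplied.", nameof(timings));
        var sorted = timings.OrderBy(t => t).ToList();
        return sorted[sorted.Count / 2];
    }
}
```

Usage would look something like MergeBenchmark.Measure(() => RunMergeOnce(), runs: 10), where RunMergeOnce is a placeholder for your own code that creates the operation and calls Execute. Running it once with docx output and once with PDF output should show how much of the time is the PDF conversion step.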
Ok - Thanks for confirming. The PDF creation service does a lot more than just render the PDF. It adds bookmarks based on Word styles, adds structure tags for accessibility, reuse, and table extraction, subsets fonts, optimizes the file for faster web viewing, and more. These are non-trivial and process-intensive tasks, but you get a better PDF as output.
That makes sense, I imagine there's a lot of stuff going on under the hood that I wouldn't expect. Do you have any recommendations for how to scale or optimize the PDF creation service given that we need to keep it running locally on our machines?
Could any of the template -> PDF processing be cached and re-used between API calls?
Would you expect different performance coming from an HTML-based template?
Do you have any suggestions or heard from any clients about how they managed scaling this API to handle a throughput that exceeds generation time?
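One generic way to get throughput above the per-document time is to overlap several generation calls, with a cap on how many run at once. Below is a minimal sketch using SemaphoreSlim and Task.WhenAll. BoundedRunner and RunAsync are hypothetical names, and the sketch assumes that the per-item work (for example, one complete merge call) is safe to run concurrently and isn't rate-limited, neither of which has been confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedRunner
{
    // Runs `work` for every item with at most `maxConcurrency` calls in flight.
    // Results come back in the same order as the input items.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items,
        Func<TItem, TResult> work,
        int maxConcurrency)
    {
        if (maxConcurrency < 1) throw new ArgumentOutOfRangeException(nameof(maxConcurrency));

        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = items.Select(async item =>
        {
            await gate.WaitAsync();
            try
            {
                // Task.Run moves the blocking call onto a thread-pool thread.
                return await Task.Run(() => work(item));
            }
            finally
            {
                gate.Release();
            }
        }).ToList(); // ToList starts every task immediately; the semaphore does the throttling.

        return await Task.WhenAll(tasks);
    }
}
```

With roughly 8 seconds per PDF, 4 overlapping calls would in the ideal case get through about 4 documents in that same 8 seconds. Whether it scales that way depends on where the time is actually being spent, so it would be worth measuring before relying on it.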
Can you elaborate on this statement?
"Do you have any recommendations for how to scale or optimize the PDF creation service given that we need to keep it running locally on our machines?"
What exactly do you mean by "keep it running locally"?
I meant to say on premise. My understanding is that by downloading the Adobe DocumentServices NuGet package and including it in our application, our data would stay on premise, even if there was an external call to verify our Adobe token. Would you please verify that assumption?
Our plan is to deploy a service / API for generating PDF / xdoc reports on premise. This service would run on our own machines (as any sort of cloud solution has the same problems with data security). The service would be exposed to other applications internally (it would not be public facing), take data as a JObject, determine the appropriate template file, and send back some representation of the generated report.
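To make the "determine the appropriate template file" step concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration: the reportType property, the template paths, and the TemplateResolver class are all made up. It also uses System.Text.Json rather than JObject, purely to keep the example inside the standard library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class TemplateResolver
{
    // Hypothetical mapping from a report type to its docx template.
    private static readonly Dictionary<string, string> Templates =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["summary"] = "templates/summary.docx",
            ["detail"] = "templates/detail.docx",
        };

    // Reads the (assumed) "reportType" property from the JSON payload
    // and returns the matching template path.
    public static string Resolve(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        JsonElement root = document.RootElement;

        if (root.ValueKind != JsonValueKind.Object ||
            !root.TryGetProperty("reportType", out JsonElement typeElement) ||
            typeElement.ValueKind != JsonValueKind.String)
        {
            throw new ArgumentException(
                "Payload must be a JSON object with a string \"reportType\" property.", nameof(json));
        }

        string reportType = typeElement.GetString();
        if (!Templates.TryGetValue(reportType, out string templatePath))
        {
            throw new KeyNotFoundException($"No template registered for report type '{reportType}'.");
        }
        return templatePath;
    }
}
```

The service would then hand the resolved template and the payload to whatever does the merge and return the generated file to the caller.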
Ah... ok. I'm so glad I asked. The PDF Tools API is a cloud service. The SDKs are local but call out to Adobe's servers for processing. I'm sorry that was not clear.
That's disappointing, but we're really glad to learn that at the current stage of development, so thank you for clarifying. Do you have any cloud services for generating PDFs that have been accepted or are in use by the US government for dealing with Veterans Affairs PIA (including SSN, other personally identifiable information, and healthcare/medical information)?
You might want to look at the PDF Generator product. You can install it on-prem.
That seems promising, thanks for the suggestion. So using PDF Generator, we would somehow handle merging the docx template with its corresponding data ourselves, then feed the resulting merged docx file into a PDF Generator service deployed on-prem. At that point we would have roughly the same output as the PDF creation service, just handled fully on-prem.
Would you be able to give me an idea of how much additional complexity is involved in reaching a similar quality of generated PDF with the naive approach above, compared to using the PDF creation service? I'm trying to understand what our options are and how they compare to each other.
Honestly, I have no expertise in that product. If you've read the marketing material, you now know as much as I do.

