Unable to import XML data into PDF using Python

Dec 18, 2018

I'm trying to use the win32com and pandas modules in Python to open a PDF template file, import an XML file, and save the new form-filled PDF file.

This is my code:

import win32com.client, win32com.client.makepy, os, winerror, pandas as pd, errno, re
from win32com.client.dynamic import ERRORS_BAD_CONTEXT
xml_file = "mq_xml.xml"
ERRORS_BAD_CONTEXT.append(winerror.E_NOTIMPL)
src = os.path.abspath('mq_template.pdf')
win32com.client.makepy.GenerateFromTypeLibSpec('Acrobat')
adobe = win32com.client.DispatchEx('AcroExch.App')
avDoc = win32com.client.DispatchEx('AcroExch.AVDoc')
avDoc.Open(src, src)
pdDoc = avDoc.GetPDDoc()
jObject = pdDoc.GetJSObject()
jObject.importDataObject("MyData", "mq_xml.xml");
jObject.SaveAs("mq_final.pdf", "com.adobe.acrobat.xml-1-00")

However, when I run this I get:

Traceback (most recent call last):
  File "parser.py", line 20, in <module>
    jObject.SaveAs("mq_final.pdf", "com.adobe.acrobat.xml-1-00")
  File "<COMObject GetJSObject>", line 2, in SaveAs
pywintypes.com_error: (-2147352567, 'Exception occurred.', (1001, 'Acrobat JavaScript', 'NotAllowedError: Security settings prevent access to this property or method.', None, 0, 0), None)

I have successfully managed to open a PDF file and save it as an Excel file, but that's about it. I was not able to do anything beyond that, and I'm not able to figure out what the issue is and how to achieve what I'm looking for.

Using Windows, Python 3 and Adobe Acrobat XI Pro.

Dec 18, 2018

ImportDataObject doesn’t do what your code suggests you think it does. Read the docs.

Why are you passing that parameter to saveAs? Read the docs.

Is this for a server? (please answer this critical question before you do more research).

Dec 18, 2018

Thanks for your reply,

I've been trying to find the right way to do this in tons of documentation pages, but I couldn't find something relevant (or that I could understand). The problem might be that I'm using Python, and the conversion might not be very easy to achieve.

Another function I've seen is

xfa.data.loadXML("file.xml")

but I'm not sure how to apply this to my Python script (I can't find any documentation for Python).

And no, this is not for a server. I'm trying to run the script on my personal computer.

Dec 18, 2018

That function is for loading the text representation of XML into an XFA model. Is this an XFA (i.e., LiveCycle) form?

What you want are the data import functions: doc.importAnXFDF or doc.importXFAData.
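As a rough sketch of how doc.importAnXFDF might be driven from Python: the snippet below opens the template through the Acrobat COM interface, calls importAnXFDF through the JavaScript bridge, and saves with PDDoc.Save rather than the JavaScript saveAs call that raised the error in the question. The names fill_from_xfdf and to_acrobat_path, the dispatch parameter, and the file names are illustrative assumptions, not part of any Adobe API. It presumes Windows, pywin32, and a full Acrobat install, and it has not been checked against every Acrobat version or security configuration.

```python
import os


def to_acrobat_path(path):
    """Turn a local path into Acrobat's device-independent form.

    Handles drive-letter paths only, e.g. C:\\forms\\data.xfdf -> /C/forms/data.xfdf.
    """
    drive, rest = os.path.splitdrive(os.path.abspath(path))
    rest = rest.replace("\\", "/")
    return "/" + drive.rstrip(":") + rest if drive else rest


def fill_from_xfdf(template_pdf, xfdf_file, output_pdf, dispatch=None):
    """Import XFDF field data into a PDF form and save the result."""
    if dispatch is None:
        # Imported lazily so this module still loads without pywin32.
        import winerror
        import win32com.client
        from win32com.client.dynamic import ERRORS_BAD_CONTEXT

        # Same workaround as in the question's code for the JSObject bridge.
        if winerror.E_NOTIMPL not in ERRORS_BAD_CONTEXT:
            ERRORS_BAD_CONTEXT.append(winerror.E_NOTIMPL)
        dispatch = win32com.client.Dispatch

    template_pdf = os.path.abspath(template_pdf)
    output_pdf = os.path.abspath(output_pdf)

    av_doc = dispatch("AcroExch.AVDoc")
    if not av_doc.Open(template_pdf, ""):
        raise RuntimeError("Acrobat could not open " + template_pdf)
    try:
        pd_doc = av_doc.GetPDDoc()
        js = pd_doc.GetJSObject()
        # JavaScript methods expect a device-independent path.
        js.importAnXFDF(to_acrobat_path(xfdf_file))
        PD_SAVE_FULL = 1  # PDSaveFull flag from the Acrobat IAC reference
        if not pd_doc.Save(PD_SAVE_FULL, output_pdf):
            raise RuntimeError("Acrobat could not save " + output_pdf)
    finally:
        av_doc.Close(True)  # True = close without prompting to save
    return output_pdf
```

With Acrobat installed, a call might look like fill_from_xfdf("mq_template.pdf", "mq_data.xfdf", "mq_final.pdf"), where mq_data.xfdf is a hypothetical XFDF file whose field names match the form. The dispatch parameter exists only so the call sequence can be exercised with stand-in objects on a machine without Acrobat.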

Here's a reference entry:

Acrobat DC SDK Documentation

Of course, your XML will have to match the grammar associated with these functions, XFA or XFDF. The only way to import the straight XML data format is through the doc.submitForm() function, which requires a server.
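To illustrate what matching the XFDF grammar could involve, here is a minimal Python sketch that rewrites a flat XML file as XFDF. It assumes (this is not known from the thread) that the source XML has a single root whose child elements are named exactly like the PDF form fields. The function name xml_to_xfdf is hypothetical, and hierarchical (dotted) field names are not handled.

```python
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"


def xml_to_xfdf(xml_text):
    """Convert flat XML (one child element per form field) into an XFDF string."""
    source = ET.fromstring(xml_text)

    # <xfdf xmlns="..."><fields><field name="..."><value>...</value></field>...
    xfdf = ET.Element("xfdf", {"xmlns": XFDF_NS})
    fields = ET.SubElement(xfdf, "fields")
    for child in source:
        # Assumption: the element name is the form field name.
        field = ET.SubElement(fields, "field", {"name": child.tag})
        value = ET.SubElement(field, "value")
        value.text = (child.text or "").strip()

    return ET.tostring(xfdf, encoding="unicode")
```

The returned string can be written to a file with a .xfdf extension (UTF-8 encoded) and then passed to doc.importAnXFDF. If the XML element names do not match the form's field names, a mapping step would be needed before building the field elements.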

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
