Hi, is there a way to convert PDF to XML using the PdfTools SDK? Thanks
How do you plan to use the XML output? We have a text extraction API in beta that outputs structured content as JSON, which you might be able to use.
My existing workflow takes a PDF that has a couple of tables. I convert the PDF to XLSX using the SDK and then process the Excel file to extract the data. The creator of the PDF has changed the data format, and now the XLSX processing is breaking. I manually converted the PDF to XML, and that seemed to give me a more flexible structure for parsing the data, hence my interest in a way to convert PDF to XML. I can also work with JSON. Or perhaps there is a simpler workflow you can recommend. Thanks