Participant
October 21, 2020
Question

Export PDF to XML using Node.js SDK

Hi, is there a way to convert PDF to XML using the PdfTools SDK? Thanks

    1 reply

    Joel Geraci
    Community Expert
    October 21, 2020

    How do you plan to use the XML output? We have a text extraction API in beta that outputs structured content in JSON that you might be able to use.

    Participant
    October 21, 2020

    My existing workflow takes a PDF that contains a couple of tables. I convert the PDF to XLSX using the SDK and then process the Excel file to extract the data. The creator of the PDF has changed the data format, and now the XLSX processing is breaking. When I manually converted the PDF to XML, the result gave me a more flexible structure for parsing the data, hence my interest in a way to convert PDF to XML. I can also work with JSON. Or perhaps there is a simpler workflow you can recommend. Thanks
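
Whichever format the table ends up in (XLSX, XML, or JSON), one way to make the processing less sensitive to layout changes is to look columns up by header text instead of by fixed position. Below is a minimal Node.js sketch of that idea. It assumes the table has already been read into an array of rows; the helper name `rowsToRecords` and the sample headers are hypothetical and not part of any SDK.

```javascript
// Hypothetical helper: turn a table (array of rows) into records keyed by
// header text, so reordered or added columns do not break later processing.
function rowsToRecords(rows, requiredHeaders) {
  const normalize = (value) => String(value ?? "").trim().toLowerCase();

  // Find the first row that contains every required header, wherever it sits.
  const headerIndex = rows.findIndex((row) => {
    const cells = row.map(normalize);
    return requiredHeaders.every((h) => cells.includes(normalize(h)));
  });
  if (headerIndex === -1) {
    throw new Error(`Header row not found; expected: ${requiredHeaders.join(", ")}`);
  }

  const headers = rows[headerIndex].map(normalize);
  return rows
    .slice(headerIndex + 1)
    // Skip rows in which every cell is empty.
    .filter((row) => row.some((cell) => normalize(cell) !== ""))
    .map((row) => {
      const record = {};
      for (const h of requiredHeaders) {
        const col = headers.indexOf(normalize(h));
        record[h] = row[col] ?? null;
      }
      return record;
    });
}

// Sample data, made up for illustration: a title row, a header row in an
// unexpected column order, and a blank row in the middle.
const sample = [
  ["Quarterly report", "", ""],
  ["Amount", "Item", "Notes"],
  [12.5, "Widget", "first"],
  ["", "", ""],
  [7, "Gadget", ""],
];
const records = rowsToRecords(sample, ["Item", "Amount"]);
console.log(records);
```

Because the header row is located by its contents, extra title rows above the table or a change in column order should not require code changes. A renamed header would still fail, but with an explicit error rather than silently wrong data.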