Participant
February 15, 2024
Answered

Extendscript to Parse XML with Doctype data

Hi, 

I am not able to parse the sample XML with the script below; the script goes into a continuous loop and never returns. I need help with this.

Script File:

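// Sample.xml is expected in the same folder as this script.
var file = new File(Folder(File($.fileName).path).fsName + "/Sample.xml");
file.encoding = "UTF-8";
file.open("r");
// Keep comments and processing instructions in the parsed tree.
XML.ignoreComments = false;
XML.ignoreProcessingInstructions = false;
// This is the call that never returns for the sample file below.
var xml = new XML(file.read());
file.close(); // was newfile.close(); "newfile" is never defined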

Sample XML File:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE subtask [
<!ENTITY lt "&#38;">
<!ENTITY gt "&#62;">
<!ENTITY amp "&#38;">
]>

<subtask>
<title>General Information</title>
<prcitem1>
<prcitem>
<para>This document provides &lt; repair instructions &amp; for the Fire Extinguisher components.</para>
</prcitem>
</prcitem1>
</subtask>

    2 replies

    frameexpert
    Community Expert
    February 15, 2024

    I tried some of my code too and it fails. For some reason, it doesn't like the internal doctype and entity declarations. When I remove these it works fine. I have an idea that might work, but am pressed for time right now. I will try to follow up later. Thanks.
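
Since the file loads once the internal doctype and entity declarations are removed, one possible workaround is to strip the DOCTYPE from the text before handing it to the XML constructor. The following is only a minimal sketch of that idea, not the XSLT approach and not a tested FrameMaker solution. `stripDoctype` is a hypothetical helper name, and the pattern assumes that no `]` character appears inside a quoted entity value in the internal subset. Custom entities declared in the DOCTYPE would no longer resolve after stripping; the sample only uses `&lt;` and `&amp;`, which are predefined in XML.

```javascript
// Hypothetical helper: remove a DOCTYPE declaration, including an
// internal subset in [ ... ], from XML text.
// Assumption: no "]" occurs inside quoted entity values in the subset.
function stripDoctype(text) {
    return text.replace(/<!DOCTYPE[^>\[]*(\[[\s\S]*?\])?\s*>\s*/, "");
}

// Shortened version of the sample file from the question.
var sample = '<?xml version="1.0" encoding="UTF-8" ?>\n' +
    '<!DOCTYPE subtask [\n<!ENTITY amp "&#38;">\n]>\n\n' +
    '<subtask><title>General Information</title></subtask>';

var cleaned = stripDoctype(sample);

// In ExtendScript the cleaned text would then be parsed, for example:
// var xml = new XML(stripDoctype(file.read()));
```

The code is written in ES3-style JavaScript so that it should also run in ExtendScript, but it has not been run inside FrameMaker.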

    Participant
    February 16, 2024

    Thank you for your time and the response. I am using FrameMaker 2017 and ExtendScript Toolkit CC to debug and run the script. The script works fine if I remove the DOCTYPE declarations, but I am looking for a solution that can parse XML with the DOCTYPE declarations in place.
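
If the DOCTYPE has to go before parsing but the document relies on entities declared in it, the references could be expanded in the text first. The sketch below illustrates that idea under several assumptions: `collectEntities` and `expandCustomEntities` are hypothetical helper names, only simple internal entities of the form `<!ENTITY name "value">` are handled, and the five predefined XML entities are left alone so the parser resolves them itself. It does a plain text substitution and does not reproduce the full entity-expansion rules of the XML specification.

```javascript
// Names every XML parser already understands; these are never replaced.
var PREDEFINED = { lt: true, gt: true, amp: true, apos: true, quot: true };

// Hypothetical helper: collect <!ENTITY name "value"> declarations
// into a name -> value map. Parameter and external entities are ignored.
function collectEntities(text) {
    var entities = {};
    var pattern = /<!ENTITY\s+([A-Za-z_][\w.\-]*)\s+(?:"([^"]*)"|'([^']*)')\s*>/g;
    var match;
    while ((match = pattern.exec(text)) !== null) {
        // Only one of the two quote groups can match.
        entities[match[1]] = match[2] || match[3] || "";
    }
    return entities;
}

// Hypothetical helper: replace &name; references with the declared value,
// skipping predefined entities and names that were never declared.
function expandCustomEntities(text, entities) {
    return text.replace(/&([A-Za-z_][\w.\-]*);/g, function (ref, name) {
        if (PREDEFINED[name] === true || !entities.hasOwnProperty(name)) {
            return ref;
        }
        return entities[name];
    });
}

// Example input with one custom entity ("co") and one redeclared
// predefined entity ("lt"), similar to the sample file.
var source = '<!DOCTYPE d [\n<!ENTITY co "Acme &#38; Sons">\n' +
    '<!ENTITY lt "&#38;">\n]>\n<d>&co; &lt; &amp;</d>';
var declared = collectEntities(source);
var expanded = expandCustomEntities(source, declared);
```

After this step the DOCTYPE still has to be removed by some other means before the text is passed to `new XML(...)`.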

    frameexpert
    Community Expert (Correct answer)
    February 16, 2024

    If you use an identity transform with XSLT, it will strip the doctype and entity declarations, which will allow you to load the file as an ExtendScript XML object. FrameMaker's XSLT engines can be invoked with ExtendScript. I learned how to do this from Jang Graat's excellent blog post:

     

    https://blog.adobe.com/en/publish/2017/11/21/xslt-support-in-framemaker-2017

     

    I can help you with code specifics, but I don't have the time to do it for free. Please let me know if you need further help. Here is an identity transform that I tested with your sample.xml. It strips the doctype and loads fine in ExtendScript.

     

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:math="http://www.w3.org/2005/xpath-functions/math"
        exclude-result-prefixes="xs math"
        version="3.0" expand-text="yes">
        
        <xsl:output indent="yes"/>
        
        <xsl:template match="/">
            <xsl:apply-templates/>
        </xsl:template>
        
        <xsl:mode on-no-match="shallow-copy"/>
        
    </xsl:stylesheet>

     

    frameexpert
    Community Expert
    February 15, 2024

    Are you running this from inside of FrameMaker? What version of FrameMaker are you using?