linGLIngMaTE
Participant
June 23, 2020
Question

Is there a built-in function (or a third-party plugin/library) that can convert/parse some text into a DOM?

  • 3 replies
  • 262 views

As above. At the moment, I'm using the Jsoup Java library. I'm just trying to see if there's anything written in CF out there. Thanks!

    This topic has been closed for replies.


    BKBK
    Community Expert
    June 27, 2020

    Even if there were an HTML parser written in ColdFusion out there, it wouldn't be as efficient as Jsoup. Underneath, ColdFusion would have to compile the CFML code to Java bytecode anyway.

     

    A Stack Overflow comparison of Java HTML parsers from over 10 years ago is still useful.

     

     

    Charlie Arehart
    Community Expert
    June 23, 2020

    I'd say jsoup is your best bet, and it's the bomb. Are you leveraging it from within CFML? You can. See resources such as https://tonyjunkes.com/blog/crash-course-in-cfml-and-jsoup/ .

     

    Beyond merely showing how to integrate it into CFML, that post is also a great general introduction to jsoup, for anyone seeing this who is not familiar with it. Another good general intro is the jsoup cookbook: https://jsoup.org/cookbook/.

     

    And in case the idea gets anyone's juices going, I'll share what was the best tip for me when I first started using jsoup: use the browser dev tools' "inspect" feature to find the "selector" value for the element you want to process with jsoup: https://www.javacodeexamples.com/jsoup-get-css-selector-for-any-dom-element-example/841
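
    For anyone who wants to see the parse-then-select workflow in code, here is a minimal Java sketch. To be clear, this is not jsoup: so that it runs with nothing but the JDK, it assumes well-formed markup and uses the built-in XML DOM parser, with an XPath expression standing in for the CSS selector you would copy from the dev tools. In jsoup, `Jsoup.parse` and `select` play the corresponding roles and also cope with messy real-world HTML. The class name `DomSketch`, its two helper methods, and the sample markup are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for this example only (not part of jsoup or ColdFusion).
class DomSketch {

    // Parse a string of WELL-FORMED markup into a W3C DOM Document.
    // Assumption: the input is valid XML/XHTML; tag soup will throw here.
    public static Document parse(String markup) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(markup)));
    }

    // Return the nodes matching an XPath expression (stand-in for a CSS selector).
    public static NodeList select(Document doc, String xpath) throws Exception {
        return (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        String markup = "<html><body><div id=\"nav\">"
                + "<a href=\"/a\">First</a><a href=\"/b\">Second</a>"
                + "</div></body></html>";

        Document doc = parse(markup);

        // Roughly what the CSS selector "div#nav > a" would match.
        NodeList links = select(doc, "//div[@id='nav']/a");
        for (int i = 0; i < links.getLength(); i++) {
            Element a = (Element) links.item(i);
            System.out.println(a.getTextContent() + " -> " + a.getAttribute("href"));
        }
    }
}
```

    Running `main` prints `First -> /a` and then `Second -> /b`. With jsoup on the classpath you would normally skip all of this and pass the copied CSS selector straight to `select`.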

     

    But yep, Jay, if you somehow really wanted to find something other than jsoup, let us know what you're looking for (I can't imagine much OUTSIDE of CF let alone anything INSIDE of it that would rival jsoup's power).

    /Charlie (troubleshooter, carehart.org)
    WolfShade
    Legend
    June 23, 2020

    I'd love to help, but I don't fully understand your question. What do you mean by "convert/parse some text into DOM"? Converting or parsing text into a Document Object Model?

     

    V/r,

     

    ^ _ ^