November 18, 2010
Question

Grabbing part of another html document

I have a page that lists the titles and teasers from other pages on the site. The title is always wrapped in the only h1 tag on the page, and the teaser is wrapped in the only h2 tag on the page, so they're easily identified.

How can I pull those into a CF page so that I have this kind of list:

Article 1 Title

Article 1 Teaser

Article 2 Title

Article 2 Teaser

Etc.

??

    November 18, 2010

    1. You can use the CFHTTP tag to read the content of another web page.

    2. You can use regular expressions to find the content of the h1 and h2 tags. See the GetHTMLTitle function on cflib.org as an example to get you started.

    http://cflib.org/udf/GetHTMLTitle
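
    The two steps above can be sketched roughly as follows. This is only a minimal illustration, not the GetHTMLTitle UDF itself: the helper name getFirstTagText and the example.com URLs are made up for the example, and it assumes each page has exactly one h1 and one h2, as described in the question.

    ```cfml
    <!--- Hypothetical helper: returns the text inside the first element with the
          given tag name (for example h1 or h2), or an empty string if none is found. --->
    <cffunction name="getFirstTagText" returntype="string" output="false">
        <cfargument name="html" type="string" required="true">
        <cfargument name="tagName" type="string" required="true">
        <!--- Non-greedy capture of everything between the opening and closing tag. --->
        <cfset var pattern = "<" & arguments.tagName & "[^>]*>(.*?)</" & arguments.tagName & ">">
        <!--- The fourth argument (true) asks for the positions of the subexpressions. --->
        <cfset var match = reFindNoCase(pattern, arguments.html, 1, true)>
        <cfif arrayLen(match.pos) gte 2 and match.len[2] gt 0>
            <cfreturn trim(mid(arguments.html, match.pos[2], match.len[2]))>
        </cfif>
        <cfreturn "">
    </cffunction>

    <!--- Placeholder URLs (assumption): replace with the real article pages. --->
    <cfset articleUrls = [
        "http://www.example.com/articles/article1.html",
        "http://www.example.com/articles/article2.html"
    ]>

    <cfloop array="#articleUrls#" index="articleUrl">
        <!--- Step 1: fetch the raw HTML of the article page. --->
        <cfhttp url="#articleUrl#" method="get" result="httpResult" timeout="10">

        <!--- statusCode looks like "200 OK"; skip pages that did not load. --->
        <cfif left(httpResult.statusCode, 3) eq "200">
            <!--- Step 2: pull the title (h1) and teaser (h2) out of the HTML. --->
            <cfoutput>
                <h1>#getFirstTagText(httpResult.fileContent, "h1")#</h1>
                <h2>#getFirstTagText(httpResult.fileContent, "h2")#</h2>
            </cfoutput>
        </cfif>
    </cfloop>
    ```

    Any markup nested inside the h1 or h2 is passed through as-is, and each page view triggers one HTTP request per article, so caching the extracted values is probably worth considering if the list is long.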

    Squiggy2 (Author)
    November 18, 2010

    Thanks JR! What would be better... to try that approach or try and use the supplied XML file? I haven't worked with XML documents before, so there's a steeper learning curve. Any good tutorials out there for learning how to bring XML data into the page (again, just the title and teaser portion)? On the "read more" page, I'm bringing in the HTML page by changing it to a .cfm file and using a cfinclude.

    November 18, 2010
    "What would be better... to try that approach or try and use the supplied XML file?"

    I would try both and use the one that works best for you. I generally prefer working with a structured format like XML; however, if your HTML is consistently formatted, a regular expression may also be a valid approach.

    "Any good tutorials out there for learning how to bring XML data into the page"

    Take a look at the CF documentation.

    http://help.adobe.com/en_US/ColdFusion/9.0/Developing/WSc3ff6d0ea77859461172e0811cbec22c24-7fb3.html
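
    As a rough starting point for the XML route, something like the sketch below may help. The layout of the supplied XML file is not shown in this thread, so the element names (articles, article, title, teaser) are purely assumed for illustration, and the XML is inlined here only to keep the example self-contained. In practice you would read the real file and adjust the XPath to match it.

    ```cfml
    <!--- Assumed sample data: the real file's element names may differ. --->
    <cfsavecontent variable="rawXml">
    <articles>
        <article>
            <title>Article 1 Title</title>
            <teaser>Article 1 Teaser</teaser>
        </article>
        <article>
            <title>Article 2 Title</title>
            <teaser>Article 2 Teaser</teaser>
        </article>
    </articles>
    </cfsavecontent>

    <!--- Parse the text into an XML document object. --->
    <cfset xmlDoc = xmlParse(trim(rawXml))>

    <!--- XPath query: returns an array with one node per article element. --->
    <cfset articleNodes = xmlSearch(xmlDoc, "/articles/article")>

    <cfoutput>
    <cfloop array="#articleNodes#" index="articleNode">
        <!--- xmlText holds the text content of each child element. --->
        <h1>#htmlEditFormat(articleNode.title.xmlText)#</h1>
        <h2>#htmlEditFormat(articleNode.teaser.xmlText)#</h2>
    </cfloop>
    </cfoutput>
    ```

    To work from a file on disk instead, the text could be loaded with cffile action="read" and passed to xmlParse in the same way.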

    Ben Forta's books are also a good resource for learning CF and include chapters on XML.
    http://www.forta.com/books/0321679199/