WolfShade
Legend
April 12, 2011
Question

CF function/UDF to clean Word-generated HTML?

Hello, everyone.

Is there, out there somewhere, a CF function or UDF that will take a string of Word-generated HTML and remove all the cruft from it?

I'm using TinyMCE as the rich-text editor for a CMS, and the people who are creating/modifying the content are using Word. When they paste into the TinyMCE editor, most of the time there is no problem. However, every once in a while something gets pasted that, no matter what one does, makes entire paragraphs come out in bold (even though they weren't bolded in Word), along with other stray formatting. I'd like to clean that out BEFORE it goes to the database.

Thanks,

^_^

    WolfShade (Author)
    Legend
    April 13, 2011

    Anyone?

    Inspiring
    April 13, 2011

    Have you looked @ cflib.org to see if someone's already done this?  Or googled "coldfusion function remove word html"?

    Is it just the dodgy Microsoft stuff you want to get rid of, rather than all mark-up?

    You might want to look @ JTidy or something like that.

    --

    Adam

    Participating Frequently
    April 13, 2011

    Try this too: http://cflib.org/udf/demoronize
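
For anyone landing here later, this is a rough sketch of the kind of cleanup the question asks for. It is written in Python purely for illustration, since nobody posted CF code in the thread. The function name `clean_word_html` and the pattern list are made up for this example and are not taken from cflib.org or JTidy. It assumes you are happy to drop ALL inline `style` attributes, not just the Word ones. The regexes would need checking against CF's regex engine before being ported.

```python
import re

# Illustrative (not exhaustive) patterns for Word-specific markup.
# Order matters: comments are removed first so conditional blocks go as a unit.
_WORD_CRUFT = [
    # HTML comments, including <!--[if gte mso 9]> ... <![endif]-->
    re.compile(r"<!--.*?-->", re.DOTALL),
    # <style> and <xml> blocks together with their contents
    re.compile(r"<(style|xml)\b[^>]*>.*?</\1\s*>", re.DOTALL | re.IGNORECASE),
    # Stray <meta> and <link> tags
    re.compile(r"<(meta|link)\b[^>]*>", re.IGNORECASE),
    # Downlevel-revealed conditionals such as <![if !supportLists]> and <![endif]>
    re.compile(r"<!\[(?:if|endif)[^\]]*\]>", re.IGNORECASE),
    # Office namespace tags such as <o:p> and </o:p> (contents are kept)
    re.compile(r"</?[a-z]+:[a-z0-9]+\b[^>]*>", re.IGNORECASE),
    # class="MsoNormal" and similar Mso* class attributes
    re.compile(r"\s+class\s*=\s*(\"Mso[^\"]*\"|'Mso[^']*'|Mso[^\s>]*)", re.IGNORECASE),
    # Assumption: every quoted inline style attribute is unwanted
    re.compile(r"\s+style\s*=\s*(\"[^\"]*\"|'[^']*')", re.IGNORECASE),
]


def clean_word_html(html: str) -> str:
    """Strip common Word-generated markup from an HTML fragment."""
    for pattern in _WORD_CRUFT:
        html = pattern.sub("", html)
    return html.strip()


if __name__ == "__main__":
    pasted = (
        '<p class="MsoNormal" style="font-weight:bold">Hello<o:p></o:p></p>'
        "<!--[if gte mso 9]><xml><w:WordDocument/></xml><![endif]-->"
    )
    print(clean_word_html(pasted))
```

Genuine `<b>` and `<strong>` tags are left alone, so if the unwanted bold comes from real tags rather than from styles, this will not remove it. Regexes are also easy to fool with unusual markup, which is a good reason to look at a real parser such as JTidy, as suggested above.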