Decode special characters, such as &ouml;

New Here, Aug 21, 2006
Hi,
I have a custom tag that is being passed an attribute value that has been entity-encoded. For example, the o-umlaut (ö) arrives escaped as &ouml;. I need to take this text and output it as a string without escaped characters.

Attached is an example that hopefully will clarify.

Is there a way to get the special characters to decode as text?

TIA,
Dave

Guest, Aug 21, 2006
Try URLDecode:
http://www.techfeed.net/cfQuickDocs/?getDoc=URLDecode

*** Edit: ***
There is no undo for HTMLEditFormat(), which is likely how the &ouml; got there.

http://www.houseoffusion.com/groups/CF-Talk/thread.cfm/threadid:36235

New Here, Aug 21, 2006
I have tried that and every other built-in formatting CFML function I can find

LEGEND, Aug 21, 2006
davidsatz wrote:
> Hi,
> I have a custom tag that is being passed a value for an attribute that is
> being encoded. For example the umlaut is being escaped in as &ouml;. I
> need to take this text and output as a string without escaped characters.

why are you using html entities instead of the real chars?

> Attached is an example that hopefully will clarify.

it doesn't. what do you want? the real chars instead of the html entities or the
html entity simply removed?

> Is there a way to get the special characters to decode as text?

don't use them in the first place.

New Here, Aug 21, 2006
hi - there is another xml/xsl application that is producing the CFM with html entities. I cannot change that app, so I am hoping to fix its effects on my metadata.

dave

LEGEND, Aug 21, 2006
davidsatz wrote:
> hi - there is another xml/xsl application that is producing the CFM with html
> entities. I cannot change that app, so I am hoping to fix its effects on my
> metadata.

"what do you want? the real chars instead of the html entities or the
html entity simply removed?"

New Here, Aug 21, 2006
I need an easy way to transform the html entities into the real chars instead

LEGEND, Aug 21, 2006 (marked as the correct answer)
davidsatz wrote:
> I need an easy way to transform the html entities into the real chars instead

here's the dirty work. you'll need to find the entities via regex, strip out the
"&" and ";" bits, take what's left & look up in the entityMap & finally
replace w/the unicode code point that's returned.

<cfscript>
function createEntityMap() {
/*
author paul hastings
date 22-aug-2006
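note maps HTML entities to unicode code points
caveat struct keys are case-insensitive, so names that differ only in case (Ouml/ouml) share one entry; e.g. Ouml below holds 246, the code point of ö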
HTML entity data derived from Roedy Green's entities.java found at
http://mindprod.com/products1.html#ENTITIES
*/
var entities=structNew();
entities["le"]=8804;
entities["Yacute"]=253;
entities["cup"]=8746;
entities["sim"]=8764;
entities["real"]=8476;
entities["sub"]=8834;
entities["gt"]=62;
entities["lfloor"]=8970;
entities["ordf"]=170;
entities["sup"]=8835;
entities["otimes"]=8855;
entities["Ouml"]=246;
entities["sube"]=8838;
entities["Sigma"]=963;
entities["reg"]=174;
entities["Beta"]=946;
entities["oplus"]=8853;
entities["Pi"]=960;
entities["ETH"]=240;
entities["rfloor"]=8971;
entities["shy"]=173;
entities["Oslash"]=248;
entities["Otilde"]=245;
entities["ang"]=8736;
entities["trade"]=8482;
entities["fnof"]=402;
entities["Chi"]=967;
entities["upsih"]=978;
entities["frac12"]=189;
entities["rlm"]=8207;
entities["Eacute"]=233;
entities["permil"]=8240;
entities["hearts"]=9829;
entities["Icirc"]=238;
entities["cent"]=162;
entities["AElig"]=230;
entities["Psi"]=968;
entities["sum"]=8721;
entities["divide"]=247;
entities["iquest"]=191;
entities["Ecirc"]=234;
entities["ensp"]=8194;
entities["empty"]=8709;
entities["forall"]=8704;
entities["emsp"]=8195;
entities["Gamma"]=947;
entities["lceil"]=8968;
entities["dagger"]=8225;
entities["not"]=172;
entities["equiv"]=8801;
entities["Acirc"]=226;
entities["Agrave"]=224;
entities["Eta"]=951;
entities["alefsym"]=8501;
entities["ordm"]=186;
entities["piv"]=982;
entities["bdquo"]=8222;
entities["Delta"]=948;
entities["or"]=8744;
entities["acute"]=180;
entities["deg"]=176;
entities["cong"]=8773;
entities["Ntilde"]=241;
entities["lsaquo"]=8249;
entities["clubs"]=9827;
entities["hellip"]=8230;
entities["Ograve"]=242;
entities["Iuml"]=239;
entities["diams"]=9830;
entities["cedil"]=184;
entities["amp"]=38;
entities["Alpha"]=945;
entities["Egrave"]=232;
entities["darr"]=8659;
entities["and"]=8743;
entities["nsub"]=8836;
entities["ne"]=8800;
entities["Epsilon"]=949;
entities["isin"]=8712;
entities["Ccedil"]=231;
entities["lsquo"]=8216;
entities["copy"]=169;
entities["Aacute"]=225;
entities["Theta"]=952;
entities["mdash"]=8212;
entities["Euml"]=235;
entities["Kappa"]=954;
entities["notin"]=8713;
entities["iexcl"]=161;
entities["ge"]=8805;
entities["Igrave"]=236;
entities["harr"]=8660;
entities["lowast"]=8727;
entities["Ocirc"]=244;
entities["infin"]=8734;
entities["brvbar"]=166;
entities["int"]=8747;
entities["macr"]=175;
entities["frac34"]=190;
entities["curren"]=164;
entities["asymp"]=8776;
entities["Lambda"]=955;
entities["frasl"]=8260;
entities["circ"]=710;
entities["crarr"]=8629;
entities["OElig"]=339;
entities["image"]=8465;
entities["there4"]=8756;
entities["lt"]=60;
entities["minus"]=8722;
entities["Atilde"]=227;
entities["ldquo"]=8220;
entities["nabla"]=8711;
entities["exist"]=8707;
entities["Auml"]=228;
entities["Mu"]=956;
entities["frac14"]=188;
entities["nbsp"]=160;
entities["Oacute"]=243;
entities["bull"]=8226;
entities["larr"]=8656;
entities["laquo"]=171;
entities["oline"]=8254;
entities["ndash"]=8211;
entities["euro"]=8364;
entities["micro"]=181;
entities["Nu"]=957;
entities["cap"]=8745;
entities["Aring"]=229;
entities["Omicron"]=959;
entities["Iacute"]=237;
entities["perp"]=8869;
entities["para"]=182;
entities["rarr"]=8658;
entities["raquo"]=187;
entities["Ucirc"]=251;
entities["Iota"]=953;
entities["sbquo"]=8218;
entities["loz"]=9674;
entities["thetasym"]=977;
entities["ni"]=8715;
entities["part"]=8706;
entities["rdquo"]=8221;
entities["weierp"]=8472;
entities["sup1"]=185;
entities["sup2"]=178;
entities["Uacute"]=250;
entities["sdot"]=8901;
entities["Scaron"]=353;
entities["yen"]=165;
entities["Xi"]=958;
entities["plusmn"]=177;
entities["yuml"]=376;
entities["THORN"]=254;
entities["rang"]=9002;
entities["Ugrave"]=249;
entities["radic"]=8730;
entities["zwj"]=8205;
entities["tilde"]=732;
entities["uarr"]=8657;
entities["times"]=215;
entities["thinsp"]=8201;
entities["sect"]=167;
entities["rceil"]=8969;
entities["szlig"]=223;
entities["supe"]=8839;
entities["Uuml"]=252;
entities["rsquo"]=8217;
entities["Zeta"]=950;
entities["Rho"]=961;
entities["lrm"]=8206;
entities["Phi"]=966;
entities["zwnj"]=8204;
entities["lang"]=9001;
entities["pound"]=163;
entities["sigmaf"]=962;
entities["uml"]=168;
entities["prop"]=8733;
entities["Upsilon"]=965;
entities["Omega"]=969;
entities["middot"]=183;
entities["Tau"]=964;
entities["sup3"]=179;
entities["rsaquo"]=8250;
entities["prod"]=8719;
entities["quot"]=34;
entities["prime"]=8243;
entities["spades"]=9824;
return entities;
}
entityMap=createEntityMap();
cent=structFind(entityMap,"cent");
writeoutput("#cent# #chr(cent)#");
</cfscript>
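
The find, strip, look up and replace steps described above can be sketched roughly as follows. This is only an illustration: decodeEntities is a hypothetical name that was not posted in the thread, and it assumes a struct shaped like the one createEntityMap() returns. Entity names missing from the map are left untouched, and because struct keys are case-insensitive, &Ouml; and &ouml; resolve to the same entry.

```cfml
<cfscript>
// Hypothetical helper (not from the thread): replaces named entities such as
// &ouml; with the character for the code point stored in entityMap.
function decodeEntities(str, entityMap) {
    var result = "";
    var pos = 1;
    var match = "";
    var name = "";
    while (pos LTE len(str)) {
        // find the next "&name;"; subexpression 1 (index 2) captures the bare name
        match = reFind("&([A-Za-z][A-Za-z0-9]*);", str, pos, true);
        if (match.pos[1] EQ 0) {
            break;
        }
        // copy any text that sits before the entity
        if (match.pos[1] GT pos) {
            result = result & mid(str, pos, match.pos[1] - pos);
        }
        name = mid(str, match.pos[2], match.len[2]);
        if (structKeyExists(entityMap, name)) {
            // known entity: emit the real character
            result = result & chr(entityMap[name]);
        } else {
            // unknown entity: keep the original text
            result = result & mid(str, match.pos[1], match.len[1]);
        }
        // continue scanning after the entity just handled
        pos = match.pos[1] + match.len[1];
    }
    // append whatever follows the last entity
    if (pos LTE len(str)) {
        result = result & mid(str, pos, len(str) - pos + 1);
    }
    return result;
}

// small demo map so the snippet runs on its own; the full list is in createEntityMap()
demoMap = structNew();
demoMap["ouml"] = 246;
demoMap["uuml"] = 252;
demoMap["amp"] = 38;
writeOutput(decodeEntities("K&ouml;ln &amp; M&uuml;nchen", demoMap));
// Köln & München
</cfscript>
```

The string is scanned once from left to right, so the output of one replacement is never decoded a second time: &amp;lt; becomes &lt; rather than <.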

New Here, Aug 22, 2006
Paul - thank you so much for this function. I was hoping that there would be an easy way to "fix" this in ColdFusion. Now I know there is a way, but it is not easy, nor necessarily something I want to implement in a custom tag that is called by every page on our site. This will be our fallback strategy if the vendor cannot fix the issues with the XSL for certain Germanic characters.
Dave

LEGEND, Aug 23, 2006
davidsatz wrote:
> an easy way to "fix" this in ColdFusion. Now I know there a way that is not
> easy or necessarily something I want to implement in a custom tag that is
> called by every page on our site. This will be our fallback strategy if the
> vendor cannot fix the issues with the XSL for certain Germanic characters.

yes not using the darned things in the first place would be best but this way's
not that bad. wrap the whole mess in a CFC & stuff it into the app scope. it
should run fairly fast.
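
One way to read "wrap it in a CFC and put it in the app scope" is sketched below. Everything here is an assumption for illustration: the component name EntityDecoder, its getChar() method, the application name, and the three-entry map standing in for the full list are not from the thread. The point is only that the map is built once at application start and then reused by every request.

```cfml
<!--- EntityDecoder.cfc (hypothetical component) --->
<cfcomponent output="false">
    <cffunction name="init" access="public" returntype="any" output="false">
        <!--- build the lookup once; only three entries are shown here --->
        <cfset variables.entities = structNew()>
        <cfset variables.entities["ouml"] = 246>
        <cfset variables.entities["amp"] = 38>
        <cfset variables.entities["euro"] = 8364>
        <cfreturn this>
    </cffunction>

    <cffunction name="getChar" access="public" returntype="string" output="false">
        <cfargument name="name" type="string" required="true">
        <!--- known name: return the real character --->
        <cfif structKeyExists(variables.entities, arguments.name)>
            <cfreturn chr(variables.entities[arguments.name])>
        </cfif>
        <!--- unknown name: hand back the entity text unchanged --->
        <cfreturn "&" & arguments.name & ";">
    </cffunction>
</cfcomponent>

<!--- Application.cfc, in the same directory (hypothetical application name) --->
<cfcomponent output="false">
    <cfset this.name = "entityDemo">
    <cffunction name="onApplicationStart" returntype="boolean" output="false">
        <!--- create the decoder once and share it across requests --->
        <cfset application.entityDecoder = createObject("component", "EntityDecoder").init()>
        <cfreturn true>
    </cffunction>
</cfcomponent>
```

A page or custom tag can then call application.entityDecoder.getChar("ouml") without rebuilding the map on every request.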

Guest, Aug 23, 2006

The solution provided is a great solution.

But is there any reason why no one suggested creating their own function
to undo what HtmlEditFormat() does?

Meaning, why not just create an HtmlUnEditFormat() function for un-doing?

For example:

<cffunction name="HtmlUnEditFormat" access="public" returntype="string" output="no">
<cfargument name="str" type="string" required="Yes" />

<!--- add more as needed --->
<cfreturn ReplaceList(arguments.str, "&nbsp;,&lt;,&gt;", " ,<,>") />
</cffunction>


Good luck!

New Here, Aug 23, 2006
I was able to combine your two suggestions and create a nice little function to do this



Thanks

LEGEND, Aug 23, 2006
<newbie /> wrote:
> But is there any reason why no one suggested creating their own function
> to undo what HtmlEditFormat() does?

good idea.