Known Participant
August 27, 2010
Question

UTF-8 page encoding is not coming out true


Hello, I'm trying to get my site up to speed with UTF-8 and it's really giving me a heck of a time doing so. I've checked and double-checked, and everything that I can see appears to say the file is to be decoded as UTF-8.


  • Inside my text editor, CFEclipse, the file encoding is set to UTF-8.
  • In application.cfc, I have
    <cfcomponent output="false">
        <!--- Force encoding (don't trust default) --->
        <!--- Note to adobe forum-goers, this is not inside a cffunction, only inside the cfcomponent tag. --->
        <cfprocessingdirective pageencoding="utf-8">
        <cfcontent type="text/html; charset=UTF-8">
        <cfset SetEncoding("URL", "UTF-8")>
        <cfset SetEncoding("Form", "UTF-8")>
        ...
    </cfcomponent>
  • In each page it uses valid HTML5, like this:
    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="UTF-8">

Even so, smart quotes end up looking like this: “ and �, and other characters such as em dashes and accented letters share a similar fate.

I checked the page properties (right click → page info), and it says the page is in UTF-8. I've also pasted this bit of code into the page, and it outputs utf-8: <cfset theEncoding = getEncoding("URL")><cfoutput>#theEncoding#</cfoutput>

What more could possibly need to be done for this to be output by the browser as true UTF-8?


    Inspiring
    August 28, 2010

    uh, the database & its driver?

    btw unless your CFCs are really outputting something, the cfcontent, etc. isn't needed.

    Known Participant
    August 30, 2010

    These are pages that aren't running through a database, though.

    Here is another nifty little bit that I found out that might clear some things up a bit and get to the source of the problem.

    If I put <cfprocessingdirective pageEncoding="utf-8"> at the top of my CFM pages, everything works out. The special characters show up fine. This feels like a pretty big kludge, though. I don't want to have to put this at the top of every single one of my pages. I'm told the server should be able to handle it all without relying on a page-by-page encoding declaration.

    So, why would application.cfc not be rendering the page correctly where cfprocessingdirective does?

    Inspiring
    August 30, 2010

    actually using cfprocessingdirective is considered a good practice. if you're not using this tag & you're getting mojibake, then for sure your text isn't properly/fully encoded as UTF-8, either as plain text or from your db. or perhaps the server's default encoding (UTF-8) has been changed?

    if some of your text is being copied from word (the smart quotes say the answer to this is probably "yes") then a browser often can't figure out the page's encoding on its own (latin-1 text with a sprinkling of word's "funny" chars which could be unicode). including the cfprocessingdirective tag forces cf to use that encoding. this is a compile time thing.

    you can gain the same thing by using a BOM but for utf-8 it's entirely optional (and pretty much useless for its intended purpose; by definition utf-8 only has the one order). and some editors don't like it (for instance eclipse, from its java roots i guess).
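
    A minimal sketch of the per-page approach discussed above. It assumes the directive is applied when each file is compiled and so only affects the file that contains it, which would explain why declaring it in application.cfc alone did not help the CFM pages. The file name page.cfm and the sample text are hypothetical, and the file itself must really be saved as UTF-8.

    ```cfml
    <!--- page.cfm (hypothetical file name) --->
    <!--- Place the directive near the top of the file so CF reads this source as UTF-8 at compile time. --->
    <cfprocessingdirective pageencoding="utf-8">
    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="UTF-8">
    </head>
    <body>
        <!--- Literal non-ASCII text: this is what turned into mojibake without the directive. --->
        <p>“Smart quotes” and café</p>
    </body>
    </html>
    ```

    The SetEncoding calls for the URL and Form scopes are a separate, runtime concern and can stay in application.cfc. Only the source-file encoding has to be declared in each file that contains non-ASCII literals.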