cffile and UTF-8

March 25, 2009
16 replies
4720 views

Hello Community!

I have a program that uploads a file to a remote FTP server. I am using cffile to write the file and it MUST be uploaded in UTF-8 format. Despite that, the file is being uploaded as ASCII or ANSI, anything except UTF-8.

This is my line of code:
<cffile action="write" file="#f_dir##f_name#" output="#dataHeader#" charset="utf-8">

charset="utf-8" is not working for me.

Does anybody else have the same problem? Any thoughts?

Thanks!

Ysais.


apocalipsis19 (Author)

Ok, I just sent them to you.

apocalipsis19 (Author)

Paul,

You said here:

"don't use either "

I think the party I am sending this file to is just opening it in Notepad++, and when he sees that it says ANSI there, he requests a different file. The file is supposed to be processed on their servers, but I haven't received any output from the processing software, only feedback from the person who administers the servers.

This has turned out to be a very time-consuming project for me.

Thanks a lot!

Newsgroup_User

apocalipsis19 wrote:
> The file should just be UTF-8. That would solve my problem.

again, can i see a zipped up version before & after uploading?

apocalipsis19 (Author)

Sure! How do I send it to you?

apocalipsis19 (Author)

Thanks Paul!

The file should just be UTF-8. That would solve my problem.

I am just opening the file in those text editors to see the encoding of the file.

Newsgroup_User

Mack wrote:
> I found this java bug that is related to the problem. It's about reading
> UTF-8 files with BOM but if it's not transparent on read I doubt it's
> transparent on write:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058

and sun marked that "bug" as "Closed, Will Not Fix". sun's not going to fix
something that it considers not "broken" (there are also "bugs" related to java
not compiling source with a BOM as well) or that will create backwards
compatibility problems--a BOM is optional for utf-8 (and pretty much useless in
utf-8 anyway) but required for utf-16 which java handles ok (if i remember rightly).

and just an FYI, sun usually gives i18n bugs short shrift. some locale resource
bugs (and i mean real bugs like stuff where they get currency/numeric formatting
dead wrong) have been around for >5 years.
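The BOM behaviour described in this reply can be illustrated with a small Java sketch. It shows that the standard UTF-8 decoder keeps a leading BOM as the character U+FEFF, while the `UTF-16` charset writes a BOM on encode and consumes it on decode. The class and method names are invented for this example and are not part of the thread.

```java
import java.nio.charset.StandardCharsets;

final class BomDecodeDemo {
    // Decodes bytes with the JDK's standard UTF-8 charset.
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Removes a leading U+FEFF if the decoder left one in place.
    static String stripBom(String s) {
        return s.startsWith("\uFEFF") ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 BOM, followed by the ASCII text "hi".
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};

        String decoded = decodeUtf8(withBom);
        System.out.println(decoded.length());           // 3: the BOM survives as U+FEFF
        System.out.println(stripBom(decoded).length()); // 2

        // The "UTF-16" charset prepends a BOM when encoding and removes it when decoding.
        byte[] utf16 = "hi".getBytes(StandardCharsets.UTF_16);
        System.out.println(utf16.length);                                        // 6
        System.out.println(new String(utf16, StandardCharsets.UTF_16).length()); // 2
    }
}
```

A reader that must accept files with or without a BOM can call something like `stripBom` after decoding.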

Newsgroup_User

apocalipsis19 wrote:
> Well,
>
> I have done further research on this issue and all of my code is correct. The
> problem is the underlying JVM. It does not properly support adding the Byte
> Order Mark to a UTF-8 file. Some people suggest adding the file through Java
> code inside the cfscript tags.
>
> I will look deeper into this and I continue to appreciate any ideas you
> guys give me!

I found this java bug that is related to the problem. It's about reading
UTF-8 files with BOM but if it's not transparent on read I doubt it's
transparent on write:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4508058

--
Mack

Newsgroup_User

apocalipsis19 wrote:
> I have done further research on this issue and all of my code is correct. The
> problem is the underlying JVM. It does not properly support adding the Byte
> Order Mark to a UTF-8 file. Some people suggest adding the file through Java

a BOM is *optional* for utf-8 by definition (and if you read the definition
you'll see why it's also pretty much un-needed). is the app on the other end
expecting a BOM?

> code inside the cfscript tags.

if your research is correct about the JVM & BOM writing (i think not, it's
optional so the app should handle writing it to a new file), then it's six of
one, half dozen of the other.

what is the app on the other end expecting *exactly*? can you put up the before
& after data (zipped up to preserve encoding)?

apocalipsis19 (Author)

Paul,

The application on the other end is expecting a UTF-8 encoded file. What really troubled me at first is that when I opened the file with EditPlus it said the file was UTF-8, but when I opened it with Notepad++ it said it was ANSI. My charset attribute is set to UTF-8 in my cffile tags. The transferMode attribute in the cfftp tag is set to BINARY. I will continue submitting the file until I fix this problem.

Mack,

Thanks for the link, I am looking into that. I will post in here whatever happens for future reference or other fellows' reference.

If you guys come up with something else I will be more than happy to read about it.

Thanks!

Ysais.
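One plausible reason the two editors disagree: a file that contains only ASCII characters and has no BOM is byte-for-byte identical in UTF-8 and in a Windows "ANSI" code page, so each editor has to guess. The Java sketch below illustrates this. It assumes ISO-8859-1 as a stand-in for the ANSI code page, and the class name, method name, and sample strings are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AsciiOverlapDemo {
    // Returns true when the text encodes to identical bytes in UTF-8 and in
    // ISO-8859-1 (used here as a stand-in for an "ANSI" code page).
    static boolean sameBytesInUtf8AndAnsi(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] ansi = text.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, ansi);
    }

    public static void main(String[] args) {
        // ASCII-only content: no way to tell the encodings apart from the bytes.
        System.out.println(sameBytesInUtf8AndAnsi("id,name,total"));     // true
        // One accented character is enough to make the encodings differ.
        System.out.println(sameBytesInUtf8AndAnsi("id,name,caf\u00e9")); // false
    }
}
```

If the uploaded data is ASCII-only, an "ANSI" label in an editor does not by itself show that the file is invalid UTF-8.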

apocalipsis19 (Author)

Well,

I have done further research on this issue and all of my code is correct. The problem is the underlying JVM. It does not properly support adding the Byte Order Mark to a UTF-8 file. Some people suggest adding the file through Java code inside the cfscript tags.

I will look deeper into this and I continue to appreciate any ideas you guys give me!

Thanks!

Ysais.
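If the receiving side really does need a BOM, one way to follow the "write it through Java" suggestion is to emit the three BOM bytes explicitly, because Java's UTF-8 encoder does not add them. This is a minimal sketch in plain Java rather than CFML. The class name, method names, sample text, and temporary file are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8BomWriter {
    // The UTF-8 encoding of U+FEFF, i.e. the optional UTF-8 byte order mark.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Encodes text as UTF-8 and prepends the BOM bytes.
    static byte[] encodeWithBom(String text) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        byte[] result = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, result, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, result, UTF8_BOM.length, body.length);
        return result;
    }

    // Writes the BOM-prefixed bytes to the target file, replacing any existing content.
    static void writeWithBom(Path target, String text) throws IOException {
        Files.write(target, encodeWithBom(text));
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("bom-demo", ".txt"); // hypothetical location
        writeWithBom(target, "header1,header2\n");
        byte[] written = Files.readAllBytes(target);
        // Print the first three bytes of the file in hex.
        System.out.printf("%02X %02X %02X%n",
                written[0] & 0xFF, written[1] & 0xFF, written[2] & 0xFF); // EF BB BF
        Files.delete(target);
    }
}
```

Whether the BOM is wanted at all depends on what the receiving application expects, which is the open question in this thread.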

Newsgroup_User

apocalipsis19 wrote:
> My problem still persists.

I think you have only 2 steps: CFFILE and CFFTP. I'd check after each
step whether the file is *really* UTF-8, cutting the problem in half.

--
Mack
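The check suggested here can be done on the raw bytes instead of trusting an editor's label. The Java sketch below strictly decodes the bytes as UTF-8 and separately reports whether a BOM is present. It could be run once on the file cffile wrote and once on the copy fetched back from the FTP server. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class Utf8Check {
    // True when the data starts with the UTF-8 BOM bytes EF BB BF.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // True when the data decodes as UTF-8 with no malformed sequences.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Usage: java Utf8Check local-copy.txt downloaded-copy.txt
    public static void main(String[] args) throws IOException {
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            System.out.println(name + ": validUtf8=" + isValidUtf8(data)
                    + ", bom=" + hasUtf8Bom(data));
        }
    }
}
```

Note that an ASCII-only file passes `isValidUtf8`, since ASCII is a subset of UTF-8.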

apocalipsis19 (Author)

My problem still persists.
