How the SDK handles unicode

Question

I spent several painful hours learning the following about how the SDK handles unicode characters -- perhaps I've missed where this is documented? Here's what I learned:

- Lua strings are sequences of 8-bit characters (bytes).

- A unicode ZString is represented as a Lua string containing the UTF-8 encoding of the unicode ZString. For example, the trademark character (TM) is unicode codepoint 2122 (hex), and the ZString LOC "$$$/unicode/tm=^U+2122" is represented as a Lua string of length three, the UTF-8 encoding of that character (decimal bytes 226 132 162).

- A posting from Adobe employee "escouten" last year said that all SDK APIs treat all Lua strings as the UTF-8 encoding of unicode strings. I've personally observed that with LrView, LrFileUtils, and LrTasks.execute, but haven't checked other APIs. In particular, a Windows unicode filename will be returned by LrFileUtils as a Lua string encoding the filename in UTF-8. Passing that filename in a command line to LrTasks.execute works correctly. (But writing a Windows batch file with a UTF-8 filename won't in general work -- a topic for another day.)
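To make the byte-level picture in the second bullet concrete, here is a minimal sketch in plain Lua 5.1 style (no Lightroom APIs involved). The helper name utf8Encode is hypothetical and not part of the SDK; it only illustrates how code point 2122 (hex) turns into the three bytes 226 132 162 mentioned above.

```lua
-- Hypothetical helper (not an SDK function): encode one Unicode code point
-- as a UTF-8 byte string, using only arithmetic available in Lua 5.1.
local function utf8Encode(cp)
  if cp < 0x80 then
    -- 1 byte: 0xxxxxxx
    return string.char(cp)
  elseif cp < 0x800 then
    -- 2 bytes: 110xxxxx 10xxxxxx
    return string.char(0xC0 + math.floor(cp / 0x40),
                       0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    -- 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return string.char(0xE0 + math.floor(cp / 0x1000),
                       0x80 + math.floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  else
    -- 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return string.char(0xF0 + math.floor(cp / 0x40000),
                       0x80 + math.floor(cp / 0x1000) % 0x40,
                       0x80 + math.floor(cp / 0x40) % 0x40,
                       0x80 + cp % 0x40)
  end
end

local tm = utf8Encode(0x2122)   -- the trademark character
print(#tm)                      -- length in bytes, not in characters
print(string.byte(tm, 1, -1))   -- the individual byte values
```

Because the # operator and string.byte count bytes, any code that truncates or indexes such strings has to take care not to cut a multi-byte character in half.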

DonCristobal · Answer

Hi John,

I'm struggling a little with this UTF-8 topic currently. I can sympathize with your several painful hours now. :-)

1) Can you (or somebody else) reproduce the following issue (Win 8.1, LR 5.6)? If your photos are stored in a UTF-8 encoded directory such as c:\users\username\Pictøäöüש (the last letter being the Hebrew letter shin). (This is kind of my test case after users from Norway and Israel reported problems.)

    local picName = selectedPhoto:getRawMetadata ("path")
    outputToLog (picName)

I get the wrong result:

    C:\Users\username\Pictøäöüש\7L6B7931.CR2

If I use, on the other hand, getFormattedMetadata:

    outputToLog (selectedPhoto:getFormattedMetadata ("folderName") .. " and " .. selectedPhoto:getFormattedMetadata ("fileName"))

I get a correct result (but not the full pathname):

    Pictøäöüש and 7L6B7931.CR2

Going from there, I could probably figure out the full path name (which does not seem to be offered in getFormattedMetadata), but I would like to figure out what's wrong with selectedPhoto:getRawMetadata ("path").

2) The following is more for reference: I cannot seem to pass the previews.db path name to sqlite if the path of the previews.db (LR catalog path) contains non-ASCII UTF-8 characters. (Other UTF-8 commands on the command line work well.) chcp 65001 doesn't help. sqlite is supposed to accept UTF-8 characters in the db name, but somehow doesn't (at least my version, which is somewhat older). I have worked around this issue by first cd-ing to the directory and then starting sqlite, i.e. along the lines of

    cd <previews-dir> && sqlite3 previews.db

This seems to work so far, even if some new issues have come up of which I don't know yet whether they are related to this or not.
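The "cd first, then start sqlite" workaround from point 2 could be put together roughly as follows. This is only a sketch under assumptions: splitPath and buildSqliteCommand are hypothetical helper names, the example path is made up for illustration, the double quotes and the /d switch (which lets cmd.exe change drive as well as directory) are additions not taken from the post, and the file is assumed to be saved as UTF-8. The code is plain Lua and only builds the command string; inside a plug-in the result would be handed to LrTasks.execute.

```lua
-- Hypothetical helper: split a full path into directory and file name.
-- Matching on bytes is safe for UTF-8 because "\" and "/" never occur
-- inside a multi-byte sequence.
local function splitPath(path)
  local dir, name = path:match("^(.*)[\\/]([^\\/]+)$")
  return dir, name
end

-- Hypothetical helper: build a command line that changes into the
-- database's directory first, so sqlite3 only sees a plain file name.
local function buildSqliteCommand(dbPath)
  local dir, name = splitPath(dbPath)
  -- Assumption: quoting protects spaces; /d also switches the drive.
  return string.format('cd /d "%s" && sqlite3 "%s"', dir, name)
end

-- Illustrative path only; a real plug-in would derive it from the catalog.
local cmd = buildSqliteCommand("C:\\Users\\username\\Pictøäöüש\\previews.db")
print(cmd)
```

The point of the design is that the non-ASCII part of the path is consumed by cd, which handled it correctly in the tests described above, while sqlite3 itself is only ever given the ASCII file name.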
