Acrobat X Form Font Display Errors, Document JS in Unicode but displays in ANSI.

Report · Nov 06, 2016

Hi everyone,

I'm looking for some help with font display issues.

I have a form whose data is stored in the document-level JavaScript. Selecting an item in one drop-down on the form populates all the other fields with the matching data from the document-level JavaScript.

The issue I'm having is that the data is encoded in UTF-8, as it relates to chemicals and includes various symbols that are not present in ANSI. When I look at the data in Notepad++ it looks perfect. However, when I use the form, the display is garbled wherever there is a "greater than or equal to" symbol (≥) or a similar character.

The form uses Arial; it was imported from a Word 2010 .docx file. The form fields are all set to use Arial, but the font info for the file says that all the fonts are using ANSI encoding. I guess this is the source of my problem.

I have tried changing the form fields to use Arial Unicode MS, but it still insists on encoding the fonts as ANSI rather than Unicode.

Is there any way to set the fonts in the document to use Unicode, rather than ANSI encoding, so my form displays correctly?

Many thanks

Report · Nov 07, 2016

You should try saving the code in ANSI encoding and then replacing the special Unicode characters in it with escape sequences.

For example, the degree symbol (http://www.fileformat.info/info/unicode/char/b0/index.htm) can be replaced by "\u00B0".
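
If the data contains many such characters, the escaping does not have to be done by hand. As a minimal sketch (the helper name `escapeNonAscii` is made up for illustration and is not part of the Acrobat API), a few lines of plain JavaScript can turn every non-ASCII character in a string into its `\uXXXX` escape, so the resulting script text contains only ASCII:

```javascript
// Hypothetical helper: replace every non-ASCII character in a string
// with its \uXXXX escape so the text can be pasted into an ANSI/ASCII script.
function escapeNonAscii(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        var code = text.charCodeAt(i); // UTF-16 code unit at position i
        if (code > 0x7F) {
            // Pad the hex value to four digits, e.g. B0 -> 00B0
            var hex = code.toString(16).toUpperCase();
            out += "\\u" + ("0000" + hex).slice(-4);
        } else {
            out += text.charAt(i);
        }
    }
    return out;
}
```

Run over the source data once (in any JavaScript environment), this would turn `≥ 5 °C` into `\u2265 5 \u00B0C`, which can then be pasted inside a quoted string in the document script. Note that the sketch does not escape quotes or backslashes already present in the input.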

Report · Nov 07, 2016

Thanks for the suggestion, but I was hoping for an easier way, as some of the chemical names contain multiple non-ANSI characters. I was hoping this could be easily updated in future by anyone, but the method you suggest looks like it might be too complicated for that purpose.

Report · Nov 07, 2016

UTF-8 is not an acceptable encoding for Acrobat JavaScript so far as I know. It will just look like mojibake.
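
To illustrate what that mojibake looks like, here is a rough sketch in plain JavaScript. It assumes the UTF-8 bytes are being read one byte per character (Latin-1 style); whether Acrobat does exactly this is an assumption, and the function name `misreadUtf8AsLatin1` is made up for the example:

```javascript
// Hypothetical demo: show how UTF-8 bytes look when each byte is
// treated as a separate Latin-1 character.
function misreadUtf8AsLatin1(text) {
    // encodeURIComponent emits one %XX per UTF-8 byte of each escaped character
    return encodeURIComponent(text).replace(/%([0-9A-F]{2})/g, function (match, hex) {
        return String.fromCharCode(parseInt(hex, 16));
    });
}

// The degree sign (U+00B0) is the two UTF-8 bytes C2 B0,
// so it shows up as two characters: "Â°"
var garbledDegree = misreadUtf8AsLatin1("\u00B0");

// The "greater than or equal to" sign (U+2265) is the three bytes E2 89 A5
var garbledGte = misreadUtf8AsLatin1("\u2265");
```

Under that assumption a single symbol turns into two or three unrelated characters, which matches the kind of garbling described in the original post.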

Report · Nov 07, 2016

Thanks for the reply, though it looks like a rather major spanner has been thrown in the works.

Do you know of any other way that data can easily be embedded in a PDF form that would also accept Unicode characters?

Cheers

Report · Nov 07, 2016

There are various ways it can be done so you don't have to hard-code the Unicode characters into your code.

For example, you can attach to the file a plain-text (UTF-8) file that pairs a simple name with each symbol (like DEGREES;°), and then create a function that reads that file and returns the correct symbol based on the name. So something like this:

function getUnicodeChar(name) {
    // read attached data file
    // loop over the values in it
    // return the found unicode char
}
event.value = "90"+getUnicodeChar("DEGREES");
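
The lookup part of that outline could be sketched as follows. This is only a sketch under stated assumptions: the attachment name "symbols.txt" and the helper `parseSymbolTable` are invented for the example, and the table is passed to `getUnicodeChar` explicitly so the file is parsed only once. In Acrobat, the file text would come from the attachment (as far as I know, via `getDataObjectContents` and `util.stringFromStream`); here a sample string stands in for it so the parsing can be tried anywhere:

```javascript
// Hypothetical helper: parse "NAME;CHAR" lines into a lookup object.
function parseSymbolTable(text) {
    var table = {};
    // Drop a leading byte-order mark, then split on Windows or Unix line ends
    var lines = text.replace(/^\uFEFF/, "").split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        var sep = lines[i].indexOf(";");
        if (sep < 1) continue; // skip blank or malformed lines
        table[lines[i].slice(0, sep)] = lines[i].slice(sep + 1);
    }
    return table;
}

// Return the symbol for a name, or an empty string if the name is unknown.
function getUnicodeChar(table, name) {
    return Object.prototype.hasOwnProperty.call(table, name) ? table[name] : "";
}

// In Acrobat the text would be read from the attachment, for example
// (assumed attachment name "symbols.txt"):
//   var text = util.stringFromStream(this.getDataObjectContents("symbols.txt"), "utf-8");
var sample = "DEGREES;\u00B0\r\nGTE;\u2265\r\n";
var table = parseSymbolTable(sample);
var value = "90" + getUnicodeChar(table, "DEGREES");
```

With this layout, whoever maintains the form only edits the attached text file, one NAME;CHAR pair per line, and never touches the script itself.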

Report · Nov 07, 2016

Some JavaScript implementations accept UCS-2 or UTF-16. I'm not sure whether Acrobat does. Has anyone tried this?

Report · Nov 07, 2016

I think it depends what kind of characters you're using... This code does work when executed from the Console, for example:

this.getField("Text1").value = "90°";

Report · Nov 11, 2016

Thanks for the example and all the help; it looks like a good bit of knowledge to have. Unfortunately, I fear this method would be a bit too much for the intended end user, so I shall come up with a plan C that doesn't involve using Unicode.

Thanks again

Report · Nov 11, 2016

Thanks for the suggestion but unfortunately neither of these options worked. Think I'll have to rethink my approach and come up with an easier to maintain solution.

Thanks again

Report · Nov 07, 2016

I think the console would be using Windows or Mac Unicode (UTF-16BE), but unfortunately that doesn't mean it can read a file in that format. Still, it could be worth a try.