Hi all, I am trying to import data to fill my form fields and am having trouble displaying special characters such as accents and umlauts. The data is saved in a tab-delimited text file, which displays these characters just fine. I also tried saving it as a tab-delimited .txt file with UTF-8 encoding, still no luck. My fonts for the fields are set to Helvetica, so I don't see why displaying these characters shouldn't be possible. Any ideas as to what is going wrong? Thanks!!
How are you importing the data? Do you use a script, or the built-in Import Form Data command?
Can you share a sample PDF and text file?
I've tried with the built-in Import Form Data command, and with the following script, which I eventually want to get working so that I can do batch imports:
/* Batch Import and Save */
var fileName = "/Users/rightsup/Documents/newlabels.txt"; // the tab delimited text file containing the data
var outputDir = "/Users/rightsup/Documents/BatchImportTrials/";
var err = 0;
var idx = 0;
var str = "";
while (err == 0) {
    err = this.importTextData(fileName, idx); // import row idx from the text file
    if (err == -1)
        app.alert("Error: Cannot Open File");
    else if (err == -2)
        app.alert("Error: Cannot Load Data");
    else if (err == 1)
        app.alert("Warning: User Cancelled File Select");
    else if (err == 2)
        app.alert("Warning: User Cancelled Row Select");
    else if (err == 3)
        app.alert("Warning: Missing Data");
    else if (err == 0) {
        // Save only when the row imported successfully
        str = String(this.getField("LabelName").value);
        str = str.replace(/[^a-zA-Z0-9]/g, ''); // keep only characters that are safe in a file name
        this.saveAs(outputDir + str + ".pdf");
        idx++;
    }
}
It is a super simple PDF form with just three fields, each set to Helvetica. The .txt file was originally in Excel, and then saved as a tab-delimited file. I also tried saving it from OpenOffice as a tab-delimited file with UTF-8 encoding, which does change the appearance of the special characters on my PDF form, but they are still not the correct ones.
Just tried it with a plain-text file and the built-in command. When the file was encoded as UTF-8 it actually didn't work correctly. But when I encoded it as ANSI it did...
How can I encode it as ANSI? I saved my spreadsheet from OpenOffice as Windows-1252 and it still didn't work. It now attempts some of the symbols but gets them wrong (incorrect French accents), and it still doesn't display umlauts properly at all. Maybe it would also be useful to know that I am trying all of this on a Mac...? No idea, I'm a beginner with this stuff. Thanks so much for your help.
In my case, some characters display incorrectly in the PDF when the import file is in UTF-8 without BOM, but if the encoding is converted to UTF-8 with BOM, all characters display correctly. Since I was not able to save the import data as a TSV file in UTF-8 with BOM from MS Excel, I used Notepad++ to do the conversion.
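For anyone without Notepad++ (for example on a Mac), the same conversion can be sketched in plain JavaScript. This is only an illustration that runs outside Acrobat (for example in Node.js), not an Acrobat script; the function names `hasUtf8Bom` and `withUtf8Bom` and the sample row are made up for the example.

```javascript
// UTF-8 byte order mark: the three bytes EF BB BF at the very start of a file.
const UTF8_BOM = [0xef, 0xbb, 0xbf];

// Returns true if the byte array already starts with the UTF-8 BOM.
function hasUtf8Bom(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === UTF8_BOM[0] &&
    bytes[1] === UTF8_BOM[1] &&
    bytes[2] === UTF8_BOM[2];
}

// Returns a copy of the bytes with the BOM prepended
// (or an unchanged copy if the BOM is already there).
function withUtf8Bom(bytes) {
  if (hasUtf8Bom(bytes)) return Uint8Array.from(bytes);
  const out = new Uint8Array(bytes.length + 3);
  out.set(UTF8_BOM, 0);
  out.set(bytes, 3);
  return out;
}

// Demo on in-memory data (hypothetical row): TextEncoder produces UTF-8 without a BOM.
const sample = new TextEncoder().encode("LabelName\tCity\nMüller\tZürich\n");
const fixed = withUtf8Bom(sample);
console.log(hasUtf8Bom(sample), hasUtf8Bom(fixed));
```

In Node.js, the bytes of a real file could be read with `fs.readFileSync` and the result written back with `fs.writeFileSync`. This only helps if the file really is UTF-8 to begin with; adding a BOM to a Windows-1252 or Mac Roman file will not repair it.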
There are many possible encodings (charsets, code pages) for files, and it is not clear which one Excel uses. If you work without knowing the encoding, you end up with a mess called mojibake. Unfortunately, the Acrobat JavaScript documentation is silent on this very important point, which creates a lot of extra work!
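As a small illustration of how mojibake arises, the sketch below encodes a word as UTF-8 and then reads the same bytes back one byte per character, as a Latin-1 reader would. It is plain JavaScript meant to run outside Acrobat (for example in Node.js); the helper name `decodeAsLatin1` and the sample word are made up for the example.

```javascript
// Interpret every byte as one character (an ISO-8859-1 / Latin-1 reading).
function decodeAsLatin1(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// "ü" is a single character, but UTF-8 stores it as two bytes: C3 BC.
const utf8Bytes = new TextEncoder().encode("Zürich");

// Reading those bytes with the wrong encoding splits "ü" into two characters.
const garbled = decodeAsLatin1(utf8Bytes);

// Reading them with the right encoding gives the original text back.
const roundTrip = new TextDecoder("utf-8").decode(utf8Bytes);

console.log(garbled);   // ZÃ¼rich
console.log(roundTrip); // Zürich
```

The bytes in the file are the same in both cases; only the encoding assumed by the reader differs.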
To find out what encoding you need, try this:
1. Fill a form with your non-ASCII characters.
2. Use the exportAsText method.
3. On a clean form, use importTextData.
4. Check that this import has worked.
5. If it worked, analyse the byte values in the text file to see what encoding was used.
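For step 5, a rough first pass over the byte values can be sketched as below. This is plain JavaScript meant to run outside Acrobat (for example in Node.js), and `guessEncoding` is a made-up helper, not an Acrobat API. It only checks for a byte order mark and for valid UTF-8; it cannot tell single-byte encodings such as Windows-1252 and Mac Roman apart.

```javascript
// Rough guess at a text file's encoding from its raw bytes.
function guessEncoding(bytes) {
  // Byte order marks identify the Unicode encodings directly.
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "UTF-8 with BOM";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return "UTF-16LE";
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return "UTF-16BE";

  // Pure 7-bit data looks identical in most encodings.
  if (bytes.every((b) => b < 0x80)) return "ASCII";

  // A strict UTF-8 decode throws on byte sequences that are not valid UTF-8.
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return "UTF-8 without BOM";
  } catch (e) {
    return "single-byte (for example Windows-1252 or Mac Roman)";
  }
}

// Demo with hand-written byte sequences for "Mü" in different encodings.
console.log(guessEncoding(Uint8Array.of(0x4d, 0xc3, 0xbc)));                   // UTF-8, no BOM
console.log(guessEncoding(Uint8Array.of(0xef, 0xbb, 0xbf, 0x4d, 0xc3, 0xbc))); // UTF-8 with BOM
console.log(guessEncoding(Uint8Array.of(0x4d, 0xfc)));                         // ü as one byte (FC)
console.log(guessEncoding(Uint8Array.of(0xfe, 0xff, 0x00, 0x4d, 0x00, 0xfc))); // UTF-16BE with BOM
```

Running this against the exported file and against the file that fails to import should at least show whether the two differ in BOM or in basic encoding family.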
Thanks for your help. I tried this and the import did not work: it displays the characters incorrectly, as something like this: ˛˝˛ &" in case that helps clarify the issue. The exported text data displays correctly in the text file, however.
You mean that if you export then import the same file, it doesn't work? Not entirely clear, sorry, as there are several different suggestions here.
Yeah, exactly. I type the characters I want in the fields and they look fine, then I export and the text file looks fine, and then I import again and the special characters are not displayed correctly.
It's possible your text editor re-encodes the file when you save it.
I use Notepad++, where you can explicitly tell it what encoding to use, as well as convert one encoding to another, but I think it's a Windows only application. I'm not sure what the equivalent for Mac would be...
Hmm, that sounds very broken. I don't have a suggestion in that case, sorry.
I write it out in Word, and then copy and paste it into the PDF form.