
converting Arabic numbers from String format to Number format

New Here, Mar 29, 2009
Hi,

I have the following code:

var item = myFoundItems.contents; // some text in Arabic followed by an Arabic number, all of type String, e.g. "Page 15"
var pageNumber = item.slice(7); // the page number itself
var convPageNumber = Number(pageNumber); // convert the string into a number

Now, after the last line, the value of convPageNumber is NaN. I think the problem is that the Arabic number is somehow not recognized...

What can I do?

Thanks,
Ola
TOPICS
Scripting
Participant, Mar 29, 2009
You might try:

pageNumber.charCodeAt(0);

in the JavaScript Console to see what character is actually there where the "1" appears to be.

Dave
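To inspect more than the first character, the same charCodeAt check can be run over the whole string. This is only a small sketch: describeCodes is a hypothetical helper name, and the sample string is an assumption standing in for the real pageNumber value.

```javascript
// Hypothetical helper: list every character of a string as a "U+XXXX" label,
// so look-alike digits from different Unicode ranges become visible.
function describeCodes(text) {
    var labels = [];
    for (var i = 0; i < text.length; i++) {
        var hex = text.charCodeAt(i).toString(16).toUpperCase();
        while (hex.length < 4) {
            hex = "0" + hex; // pad to four hex digits
        }
        labels.push("U+" + hex);
    }
    return labels.join(" ");
}

// Assumed sample: a Western '1' followed by an Arabic-Indic '1'.
var sample = "1\u0661";
var described = describeCodes(sample); // "U+0031 U+0661"
```

Anything in the U+0660..U+0669 range in the output would explain why Number() returns NaN.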
Community Expert, Mar 29, 2009
You might find that there is no '1' at all.
There are two ways of putting Arabic digits in fonts. The first is pretty naive and old-fashioned: replacing the glyphs representing '0'..'9' with ones in a suitable Arabic style. It's pretty naive, because the same has been done with, for example, Greek fonts -- 'a' becomes 'alpha', 'b' becomes 'beta', 'c' ... now, wait ...
The 21st century solution is to use Unicode coding for Arabic digits -- moving them from code points U+0030..U+0039 to U+0660..U+0669. That means you can happily use both Western and Arabic digits in the same font and document. It also means that to enter Arabic digits, your software must be smart enough to check the language (I expect that in an Arabic OS, the OS will take care of that).

Now what number do you get when you inspect the character code? If the font used was a 'hack' one, you will get the code for a '1' (U+0031). But since the Number function relies on that code, and it apparently fails, I'm betting you get the code for a '١' -- the Arabic '1', U+0661.

If you need to convert the string to a real number, inspect the string one character code at a time, making no assumptions about the coding. If the code is between 48 and 57 (inclusive), it's a Western digit; subtract 48 from it to get the numerical value. If it is between 1632 and 1641 (the decimal values of the hex-based codes U+0660..U+0669), simply subtract 1632 to get the numerical value. Any other value could be considered 'end-of-number' -- or, well, except if you have used 'extended Arabic' (U+06F0..U+06F9), 'Devanagari' (U+0966..U+096F), 'Bengali', or any other script system's digits as well.

Reconstructing the entire number from these is basic math.
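The steps above could be sketched roughly as follows. The function name parseLeadingDigits is made up for illustration, and the sketch assumes the digits are stored most-significant first and that only Western (48..57) and Arabic-Indic (1632..1641) digits need handling; other digit ranges would need extra branches.

```javascript
// Hypothetical helper: read digits from the start of a string, accepting both
// Western digits (U+0030..U+0039) and Arabic-Indic digits (U+0660..U+0669).
// Stops at the first character that is not a digit in either range.
function parseLeadingDigits(text) {
    var value = 0;
    var sawDigit = false;
    for (var i = 0; i < text.length; i++) {
        var code = text.charCodeAt(i);
        var digit;
        if (code >= 48 && code <= 57) {
            digit = code - 48;      // Western digit
        } else if (code >= 1632 && code <= 1641) {
            digit = code - 1632;    // Arabic-Indic digit
        } else {
            break;                  // treat anything else as end-of-number
        }
        value = value * 10 + digit; // the "basic math": shift left, add digit
        sawDigit = true;
    }
    return sawDigit ? value : NaN;
}

// Assumed sample: Arabic-Indic "15".
var convPageNumber = parseLeadingDigits("\u0661\u0665"); // 15
```

Returning NaN when no digit is found mirrors what Number() does for unparseable input, so existing checks on the result keep working.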
New Here, Mar 30, 2009
Thanks for the replies.
My document indeed was using Arabic digits that are located in a different Unicode position, so the Number function failed.

What I actually did in the end to solve that was pretty primitive:
I put Find/Replace commands in the script to search for all 10 Arabic digits and replace them with the regular digits. This solved the problem...

Ola
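A string-level variant of this replace approach, which leaves the document text untouched, might look like the sketch below. It is not the Find/Replace script described above; toWesternDigits is a hypothetical name, and the sketch assumes the scripting engine accepts a function as the second argument of String.replace.

```javascript
// Hypothetical helper: map each Arabic-Indic digit (U+0660..U+0669) to the
// matching Western digit (U+0030..U+0039) and leave all other characters alone.
function toWesternDigits(text) {
    return text.replace(/[\u0660-\u0669]/g, function (ch) {
        return String.fromCharCode(ch.charCodeAt(0) - 1632 + 48);
    });
}

// Assumed sample: Arabic-Indic "15", which Number() alone turns into NaN.
var pageNumber = "\u0661\u0665";
var convPageNumber = Number(toWesternDigits(pageNumber)); // 15
```

Because only the copied string is changed, the Arabic digits in the layout stay as they are.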