ID Script needed for Unicode conversion through external website

Report · Aug 04, 2022

Hi Community,

I want to do the following -

1- I have an InDesign document written in non-Unicode Kannada (an Indian language).

2- I know a website that converts non-Unicode Kannada to Unicode, and it doesn't have a character limit (https://aravindavk.in/sanka/).

3- I want to automate the process so that I don't lose formatting while the text in my document gets replaced with Unicode. (I can't do it by copy, convert, find and replace, as that would take too much time.)

4- I would like someone to help create a script that converts my document's text through the website mentioned above, in an automated manner.

Any help would be appreciated.

Note: this is an educational project, and hence I urge everyone to please help here.

Report · Aug 04, 2022

Hi @Suhaib Husain

The JavaScript converter used by the website you've mentioned is available unobfuscated:

http://www.kagapa.in/conversion/kn.js

So it's probably not a big deal to get it working locally (which would be much faster than an HTTP transaction). However, as the code is not optimized enough, it won't translate directly into ExtendScript. The many nested `.replace()` calls will likely lead to a stack overrun (or similar) error.
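One way to avoid a long chain of nested `.replace()` calls is to keep the mappings in a flat table and apply them in a loop. The sketch below is only an illustration and is not the kn.js code: `PAIRS` and `convertText` are hypothetical names, and the two entries are the sample mappings quoted elsewhere in this thread. It sticks to plain ES3-style JavaScript on the assumption that this is what ExtendScript accepts.

```javascript
// Hypothetical mapping table: [ANSI sequence, Unicode replacement].
// Only the two sample pairs from this thread are listed here.
// Longer ANSI sequences come first so that a shorter key cannot
// consume part of a longer one.
var PAIRS = [
    ["0i\u00C0\u00C4", "\u0CAF"],   // "0iÀÄ" -> ಯ
    ["Q", "\u0C95\u0CBF"]           // "Q"    -> ಕಿ
];

// Apply every pair in turn with split/join: one flat loop,
// no nested calls and no regular-expression escaping needed.
function convertText(text) {
    var out = text, i;
    for (i = 0; i < PAIRS.length; i++) {
        out = out.split(PAIRS[i][0]).join(PAIRS[i][1]);
    }
    return out;
}
```

Applying the pairs one after another is safe here only because the replacements are Kannada code points that never appear as ANSI keys; a table where outputs could collide with later keys would need a single-pass scan instead.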

Any skilled scripter on this forum could make it work anyway and provide a clean jsx script for InDesign. But this takes time, intelligence, and expertise. Even if this is “an educational project [that makes you] urge everyone to please help here”, I can't promise you'll find someone charitable enough to do this job blindly for… nothing.

Regards,

Marc

Report · Aug 05, 2022

Hi Marc,

Thank you for responding. An educational project doesn't necessarily mean we are looking for pro bono support. We can pay, but since it's an educational project the payment won't match the industry standard; rather, some amount will be paid as a token of gratitude. I would be grateful if someone could support us.

Report · Aug 05, 2022

@Marc Autret can you help, Marc?

Report · Aug 05, 2022

can you help marc?

I surely could.

The ANSI-to-Unicode converter is trivial.
However, automating the process “so that (you) won't lose formatting while at the same time the text in (your) document get replaced with unicode” is not trivial. If you want font conversion and character styling perfectly maintained at the word level while processing every part of the layout (anchored objects, notes, tables…), this sounds to me like a serious task. The reason is that ANSI-encoded characters do not align with their Unicode counterparts; e.g. “Q” → “\u0C95\u0CBF” (ಕಿ) or “0iÀÄ” → “\u0CAF” (ಯ). This means that character/word/range/paragraph lengths are not preserved during the conversion.
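The length problem can be made concrete with a small sketch. It assumes the text has already been split into style runs, each a plain object with hypothetical `text` and `style` fields; `PAIRS`, `convertText` and `convertRuns` are illustrative names, not InDesign API calls. Each run is converted on its own and its new offset is recomputed, since the old offsets no longer apply.

```javascript
// Hypothetical two-entry table built from the examples above.
var PAIRS = [
    ["0i\u00C0\u00C4", "\u0CAF"],   // "0iÀÄ" (4 chars) -> ಯ  (1 code unit)
    ["Q", "\u0C95\u0CBF"]           // "Q"    (1 char)  -> ಕಿ (2 code units)
];

function convertText(text) {
    var out = text, i;
    for (i = 0; i < PAIRS.length; i++) {
        out = out.split(PAIRS[i][0]).join(PAIRS[i][1]);
    }
    return out;
}

// Convert each style run separately and rebuild start/length,
// so every style stays attached to its own converted text.
function convertRuns(runs) {
    var result = [], offset = 0, i, converted;
    for (i = 0; i < runs.length; i++) {
        converted = convertText(runs[i].text);
        result.push({
            text: converted,
            style: runs[i].style,
            start: offset,
            length: converted.length
        });
        offset += converted.length;
    }
    return result;
}
```

For a bold “Q” followed by a regular “0iÀÄ”, the input run lengths are 1 and 4, while the converted runs have lengths 2 and 1, with the second run starting at offset 2. The sketch does not handle an ANSI sequence that straddles a style boundary, which is one of the cases that makes the real task hard.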

Thus, that would certainly be an interesting script to implement, provided that expertise and working time are fairly remunerated. As I wrote above, that is the whole issue. But maybe @Robert at ID-Tasker is your solution, so get in touch with him.

Regards,

Marc

Report · Aug 06, 2022

However, automating the process “so that (you) won't lose formatting while at the same time the text in (your) document get replaced with unicode” is not trivial. If you want font conversion and character styling perfectly maintained at the word level while processing every part of the layout (anchored objects, notes, tables…), this sounds to me like a serious task. The reason is that ANSI-encoded characters do not align with their Unicode counterparts; e.g. “Q” → “\u0C95\u0CBF” (ಕಿ) or “0iÀÄ” → “\u0CAF” (ಯ). This means that character/word/range/paragraph lengths are not preserved during the conversion.

I have gotten lost in these weeds myself, repeatedly trying to write a Limon -> Unicode converter for Khmer. Every few years, I say to myself "Okay, I'm lots better at JS now, I should take another stab at the Limon converter" and walk away converterless and totally humbled, every time.

Report · Aug 05, 2022

Maybe try to export as IDML and then try to convert in one go or split into pieces - maybe it won't mess up tags...
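A rough sketch of that idea, under several assumptions: an IDML file is a ZIP package, its story XML files keep the text inside `<Content>` elements, and one such file has already been unzipped and read into a string. The function names are hypothetical, and `convert` stands for whatever ANSI-to-Unicode function is used. Only the element text is passed to the converter, so tags and attributes stay as they are.

```javascript
// Decode the five predefined XML entities so the converter sees real text.
// "&amp;" is decoded last to avoid decoding the same text twice.
function decodeXml(s) {
    return s.replace(/&lt;/g, "<").replace(/&gt;/g, ">")
            .replace(/&quot;/g, '"').replace(/&apos;/g, "'")
            .replace(/&amp;/g, "&");
}

// Re-escape the characters that must not appear raw in element text.
function encodeXml(s) {
    return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Run `convert` on the text of every <Content> element and leave
// everything outside those elements (tags, attributes) unchanged.
function convertStoryXml(xml, convert) {
    return xml.replace(/<Content>([\s\S]*?)<\/Content>/g, function (match, text) {
        return "<Content>" + encodeXml(convert(decodeXml(text))) + "</Content>";
    });
}

// Toy converter for demonstration only: one mapping from this thread.
function toyConvert(s) {
    return s.split("Q").join("\u0C95\u0CBF");
}
```

With `toyConvert`, a story fragment whose `<Content>` holds “Q” gets the converted text, while a style name such as “CharacterStyle/Quote” in an attribute is left alone. Anything other than plain text inside the element (numeric character references or processing instructions, for example) is not handled, and repacking the files into a valid IDML package is a separate step.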
