
Change multiple finds - Erase all Diacritics

Guru, Apr 22, 2012

Hi all,

I made a script to remove a range of diacritics (the squiggly bits at the top and bottom of letters) from selected text. It works, but I thought it could be made more efficient by using findText().

My question is: can one search for a range of Unicode values (or, for that matter, a list of words like mom, mum, mommy, mummy, mam etc.) so that they can be deleted or changed to the same thing (in the case of the word list, to mother) without having to loop through every character in the selection?

My script is:


#target "InDesign"

app.doScript("main()", ScriptLanguage.JAVASCRIPT, undefined, UndoModes.FAST_ENTIRE_SCRIPT, "Remove Vowels");

function main()
{
    var cc, t, w, x, d, q, unicode;
    cc = 0;
    t = app.selection[0];
    w = t.characters; // the characters of the selection
    x = [];           // indices of the characters to remove

    for (d = 0; d < w.length; d++) {
        try {
            unicode = w[d].contents.charCodeAt(0);
            // Unicode range to remove
            if ((unicode > 0x0590 && unicode < 0x05BE) ||
                (unicode > 0x05C0 && unicode < 0x05C3) ||
                (unicode > 0x05C3 && unicode < 0x05C6) ||
                unicode == 0x05BF ||
                unicode == 0x05C7) {
                x[cc] = d;
                cc++;
            }
        }
        catch (noUnicode) {} // contents is not always a plain string
    }

    // Remove from the end so that the stored indices stay valid
    q = cc - 1;
    while (q > -1) {
        try {
            w[x[q]].remove();
        }
        catch (error) {}
        q--;
    }
}

I would also like to know whether one can change a Unicode range or word list using the regular InDesign Find/Change interface.

Thanks in advance.

Trevor
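For anyone who wants to check the range test from the script above outside InDesign, here is a minimal sketch in plain JavaScript. The names isHebrewPoint and removePoints are hypothetical; the code points are the ones the script's condition accepts (U+0591 to U+05BD, U+05BF, U+05C1, U+05C2, U+05C4, U+05C5 and U+05C7), and the functions work on an ordinary string rather than on an InDesign text object.

```javascript
// Hypothetical helper: true if the code point is one of the marks
// accepted by the if-condition in the original script.
function isHebrewPoint(code) {
    return (code >= 0x0591 && code <= 0x05BD) ||
           code === 0x05BF ||
           code === 0x05C1 || code === 0x05C2 ||
           code === 0x05C4 || code === 0x05C5 ||
           code === 0x05C7;
}

// Remove those marks from a plain string, one character at a time.
function removePoints(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        if (!isHebrewPoint(text.charCodeAt(i))) {
            out += text.charAt(i);
        }
    }
    return out;
}
```

This is the character-by-character approach the question hopes to avoid; it is only useful for confirming which code points the condition covers.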

Guru, Apr 22, 2012

I found out how to use the main InDesign interface to find a list of words.

Search with GREP, then go to Match and then Or.

It should be easy to find out how to script that, but I am still doubtful about the Unicode ranges.

ScreenShot005.png

Community Expert, Apr 22, 2012 (correct answer)

Find unicode ranges:

[\x{0590}-\x{05BE}]  (find range 0590-05BE)

[\x{0590}-\x{05BE}\x{05C0}-\x{05C6}]  (find ranges 0590-05BE and 05C0-05C6)

Replace items from a list with a single item:

Find what: \b(mom|mum|mommy|mummy|mam)\b

Replace with mother

You need to do this in the GREP tab.

Peter
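As a rough illustration of what the two expressions do, here is a sketch using ordinary JavaScript regular expressions on plain strings. It is not InDesign GREP: JavaScript writes code points as \uXXXX where the GREP tab uses \x{XXXX}, and the names stripRange and unifyWords are made up for this example.

```javascript
// JavaScript counterparts of the two GREP expressions above.
var POINTS = /[\u0590-\u05BE\u05C0-\u05C6]/g;        // two code-point ranges
var MOTHERS = /\b(mom|mum|mommy|mummy|mam)\b/g;      // a list of whole words

// Delete every character that falls inside either range.
function stripRange(text) {
    return text.replace(POINTS, "");
}

// Replace every word from the list with the single word "mother".
function unifyWords(text) {
    return text.replace(MOTHERS, "mother");
}
```

The \b anchors keep the word list from matching inside longer words, so a word such as "mammal" is left alone.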

Guru, Apr 22, 2012

Peter

Brilliant, I was at least 90% sure that you would be the one to answer.

In a script it goes:

var mySelection = app.selection[0];
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
app.findChangeGrepOptions.includeFootnotes = false;
app.findChangeGrepOptions.includeHiddenLayers = false;
app.findChangeGrepOptions.includeLockedLayersForFind = false;
app.findChangeGrepOptions.includeLockedStoriesForFind = false;
app.findChangeGrepOptions.includeMasterPages = false;
// Unicode range
app.findGrepPreferences.findWhat = "[<0591>-<05BD>x<05C1>x<05C2>x<05C4>x<05C5>x<05C7>x<05BF>]";
app.changeGrepPreferences.changeTo = NothingEnum.nothing;
mySelection.changeGrep();
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;

I found the basic <0591> format in a post of yours on this forum from 5 years back: http://21.adobe-scripting-indesign.overzone.net/find-change-using-unicode-t1610.html

Your answer above gave away the missing details.

I guess it would be a very good idea to buy this book: http://shop.oreilly.com/product/9780596156015.do

This script is countless times quicker than the one above.

Thanks a million.

Community Expert, Apr 22, 2012

Trevor,

The <0000> format is replaced with the corresponding character in the Find what field, which often makes it barely readable. The \x{0000} format is not replaced, and I find that easier. As to that book, you guess right!

Peter

Guru, Apr 22, 2012

Thanks Peter,

When I wrote about using the <0000> format I was referring to scripting, not to the GREP tab.

I think you must have seen this post in email form and missed the lines of script.

In scripting, these three options work:

app.findGrepPreferences.findWhat =  "[<0591>-<05BD>x<05C1>x<05C2>x<05C4>x<05C5>x<05C7>x<05BF>]";

app.findGrepPreferences.findWhat = "[\u0591-\u05BCx\u05c2x\u05C4x\u05C7x\u05BF]";

app.findGrepPreferences.findWhat = "[֑-ֿxׁxׂxׄxׅxׇ]";

This does not

app.findGrepPreferences.findWhat = "[\x{0591}-\x{05BD}x\x{05C1}x\x{05C2}x\x{05C4}x\x{05C5}x\x{05C7}x\x{05BF}]";

Don't know why.

On the GREP tab:

The unlucky option is [\u0591-\u05BCx\u05c2x\u05C4x\u05C7x\u05BF], which does not work properly (in fact it hardly works at all!).

[\x{0591}-\x{05BD}x\x{05C1}x\x{05C2}x\x{05C4}x\x{05C5}x\x{05C7}x\x{05BF}] scores top for readability.

Both [<0591>-<05BD>x<05C1>x<05C2>x<05C4>x<05C5>x<05C7>x<05BF>] and [֑-ֿxׁxׂxׄxׅxׇ] work (as you said, the former is converted to the latter in the Find what field). They have the readability issue on the one hand, but on the other hand they are easier to enter if you can read them.

Anyway, I'm quite pleased: from not knowing any way to use the GREP tab or app.findGrepPreferences.findWhat in a script (besides one diacritic at a time!), I now know 3 for each!

Regards, Trevor

P.s. Plan to get the book later in the day!

Community Expert, Apr 23, 2012

If you write out GREP expressions in JavaScript to use with findGrep/changeGrep, you must take into account that backslashes inside a JavaScript string need escaping. Therefore you need to double each of them:

\\x{0591}

(etc.)

The "exceptions" -- there are always some -- are \r, \t, and \n, but in fact those aren't as special as they seem. They get translated into literal character codes for Carriage Return, Tab, and Line Feed, and as it happens, those can be fed as well into the findWhat string, even though you cannot type them in the interface (after inserting them with your script, sometimes you can see the GREP find field struggle with trying to display the string).

You could try if the special Unicode GREP group "\p{Mn}" finds all of the non-spacing markers you want to get rid of -- I think this class of commands is mentioned in Peter's book as well.
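A small sketch of the escaping point, runnable in any JavaScript engine. The variable names and the helper hasBackslash are hypothetical. It uses \p rather than \x{ because a current engine such as Node.js rejects an unescaped "\x{" in a string literal as a syntax error, so that case cannot be shown in runnable form.

```javascript
// In a string literal an unrecognised escape such as "\p" keeps the
// letter and silently drops the backslash.
var unescaped = "\p{Mn}";   // the string holds: p{Mn}
var escaped = "\\p{Mn}";    // the string holds: \p{Mn}

// "\t" is a recognised escape: it becomes one real tab character.
var tab = "\t";

// Hypothetical helper: does the parsed string still contain a backslash?
function hasBackslash(pattern) {
    return pattern.indexOf("\\") !== -1;
}
```

Only the doubled form hands an actual backslash to the GREP engine, which is why the single-backslash pattern never reaches findWhat intact.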

Community Expert, Apr 23, 2012

Ah, yes, the Unicode properties \p{ }. They're quite useful. Two of my favourites are \p{Zs} 'all spaces except tab and return' and \p{Pd} 'all hyphens and dashes'. And yes, all 37 of them are described in the book.

Peter
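To explore these classes outside InDesign, modern JavaScript offers the same \p{...} names when a regular expression carries the u flag (ES2018 and later). This is a sketch under that assumption: ExtendScript is based on a much older ECMAScript edition and is not expected to accept the u flag, and the names stripMarks and countMatches are hypothetical.

```javascript
// Requires an ES2018+ engine such as a current Node.js; the `u` flag
// enables \p{...} in JavaScript regular expressions.
var MARKS = /\p{Mn}/gu;   // non-spacing marks
var SPACES = /\p{Zs}/gu;  // space separators (tab and return excluded)
var DASHES = /\p{Pd}/gu;  // hyphens and dashes

// Delete every non-spacing mark from a plain string.
function stripMarks(text) {
    return text.replace(MARKS, "");
}

// Count how many characters of a string belong to a class.
function countMatches(text, re) {
    var found = text.match(re);
    return found ? found.length : 0;
}
```

Note that \p{Mn} is not limited to Hebrew: it also matches combining accents in other scripts, so it is broader than the explicit ranges discussed earlier.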

Enthusiast, Apr 23, 2012

Peter Kahrel wrote:

Ah, yes, the Unicode properties \p{ }. They're quite useful. Two of my favourites are \p{Zs} 'all spaces except tab and return' and \p{Pd} 'all hyphens and dashes'. And yes, all 37 of them are described in the book.

Wow. How have I gone this long without knowing about these? Guess I should have read your book. Here's another resource.

Jeff

Community Expert, Apr 23, 2012

It's never too late, Jeff ! That source you mention is indeed very good. It's where I first learnt grep, back in CS2 days. It's not InDesign-specific though, so not everything discussed there applies to InDesign. Good site nevertheless. Those codes are illustrated with an InDesign document here: http://www.kahrel.plus.com/indesign/grep_mapper.html

Peter

Guru, Apr 23, 2012

Nice resource, Jeff. There's also a nice GREP mapper PDF table on Peter's site, but for (just) less than $10 I'm sure Peter would second me that it's worth going for the book!!

(Just saw that Peter beat me to it with the mapper)

Guru, Apr 23, 2012

Jongware

I should have been able to figure out the escaping of the \. Oh well, better luck next time.

So now I have another 2 methods for scripting:

app.findGrepPreferences.findWhat = "[\\x{0591}-\\x{05BD}x\\x{05C1}x\\x{05C2}x\\x{05C4}x\\x{05C5}x\\x{05C7}x\\x{05BF}]";

and

app.findGrepPreferences.findWhat = "\\p{Mn}";

I did try the \p{Mn} method in the script but it didn't work because I didn't escape it.

And in the GREP tab, another one:

\p{Mn}

Well, suddenly overwhelmed with choice, the winning script is:

var mySelection = app.selection[0];
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
// Unicode range
app.findGrepPreferences.findWhat = "\\p{Mn}";
app.changeGrepPreferences.changeTo = NothingEnum.nothing;
mySelection.changeGrep();
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;

Short and sweet (and quick).

Peter

I kept my word about getting the book, and you can test me on the 37 \p{} methods tomorrow.
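For the explicit-range variant, the escaped \x{...} class can also be generated instead of typed by hand. The following is a minimal sketch in plain JavaScript; grepClass is a hypothetical helper, not part of the InDesign scripting API, and the list of code points is the one used earlier in the thread. The items are joined without any separator, since in standard regular-expression syntax a literal letter placed between the items of a character class would itself be matched.

```javascript
// Hypothetical helper: build a GREP character class from hex code
// points; a two-item array denotes a range.
function grepClass(items) {
    var parts = [];
    for (var i = 0; i < items.length; i++) {
        var item = items[i];
        if (item instanceof Array) {
            parts.push("\\x{" + item[0] + "}-\\x{" + item[1] + "}");
        } else {
            parts.push("\\x{" + item + "}");
        }
    }
    // No separator: every character inside [...] is part of the class.
    return "[" + parts.join("") + "]";
}

var hebrewPoints = grepClass([["0591", "05BD"], "05BF", "05C1", "05C2", "05C4", "05C5", "05C7"]);
```

The resulting string could then be assigned to app.findGrepPreferences.findWhat in the same way as the hand-written patterns above.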

Community Expert, Apr 23, 2012

If it's brevity you're after:

app.findGrepPreferences = app.changeGrepPreferences = null;
//Unicode Range
app.findGrepPreferences.findWhat = "\\p{Mn}";
app.selection[0].changeGrep();
app.findGrepPreferences = app.changeGrepPreferences = null;

Peter

Guru, Apr 23, 2012

T Y

I think that by comparing my original script with this final one, one can see an excellent example of how well and how poorly a script can be made.

Well, I'm happy I saw there was a problem, didn't have that "very British attitude" ;-), did complain, and something did change!

(see towards the bottom of http://21.adobe-scripting-indesign.overzone.net/find-change-using-unicode-t1610.html)

Guru, Apr 23, 2012

Peter

I forgot to mention....

I like the book. I see the basic GREP | (or) function is right on the very first page after the contents, although I got a little scared of the python that spat on the second page.

Trevor
