Hi all,
I made a script to remove a range of diacritics (the squiggly bits at the top and bottom of letters) from selected text. It works, but I thought it could be made more efficient by using findText().
My question is: can one search for a range of Unicode values (or, for that matter, a list of words like mom, mum, mommy, mummy, mam etc.) so that they can be deleted or changed to the same thing (in the case of the word list, to mother) without having to loop through every character in the selection?
My script is:
#target "InDesign"
app.doScript("main()", ScriptLanguage.javascript, undefined, UndoModes.FAST_ENTIRE_SCRIPT, "Remove Vowels");
function main()
{
var cc, t, w, x, d, q, myCharacter, myChar, unicode;
cc = 0;
t = app.selection[0];
w = []; // the characters of the selection
x = []; // indexes of the characters to remove
for (d = 0; d < t.characters.length - 1; d++) {
w[d] = t.characters[d];
try {
myCharacter = w[d];
myChar = myCharacter.contents;
unicode = myChar.charCodeAt(0);
// Unicode range to remove
if ((unicode > 0x0590 && unicode < 0x05BE) ||
(unicode > 0x05C0 && unicode < 0x05C3) ||
(unicode > 0x05C3 && unicode < 0x05C6) ||
unicode == 0x05BF ||
unicode == 0x05C7)
{ x[cc] = d; cc++; }
}
catch (noUnicode) {}
}
// Remove from the end backwards so the remaining indexes stay valid
q = cc - 1;
while (q > -1) {
try {
w[x[q]].remove();
}
catch (error) {}
q--;
}
}
I would also like to know whether one can change a Unicode range or word list using the regular InDesign Find/Change interface.
Thanks in advance.
Trevor
I found how to use the main InDesign interface for finding a list of words.
Search using GREP, then go to Match and then Or.
It should be easy to find out how to script that, but I am still doubtful about the Unicode ranges.

Find unicode ranges:
[\x{0590}-\x{05BE}] (find range 0590-05BE)
[\x{0590}-\x{05BE}\x{05C0}-\x{05C6}] (find ranges 0590-05BE and 05C0-05C6)
Replace items from a list with a single item:
Find what: \b(mom|mum|mommy|mummy|mam)\b
Replace with mother
You need to do this in the GREP tab.
Peter
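For readers who want to try the two ideas outside InDesign, here is a minimal sketch in plain JavaScript. It is not InDesign GREP: JavaScript writes \u0591 where GREP writes \x{0591}, and the two engines differ in other details too. The function names are made up for the illustration, and the code points are the ones tested in the original script at the top of the thread.

```javascript
// Plain JavaScript sketch, not InDesign GREP.
// A character class can mix ranges (0591-05BD) and single code points.
var HEBREW_MARKS = /[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]/g;

// Alternation inside a group, anchored on word boundaries.
var MOTHER_WORDS = /\b(mom|mum|mommy|mummy|mam)\b/g;

// Hypothetical helper: delete every character in the class.
function stripHebrewMarks(text) {
    return text.replace(HEBREW_MARKS, "");
}

// Hypothetical helper: map every word in the list to "mother".
function normaliseMother(text) {
    return text.replace(MOTHER_WORDS, "mother");
}
```

Because of the \b anchors, a word such as "mammal" is left alone even though it starts with "mam".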
Peter
Brilliant, I was at least 90% sure that you would be the one to answer.
In script it goes
var mySelection = app.selection[0];
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
app.findChangeGrepOptions.includeFootnotes = false;
app.findChangeGrepOptions.includeHiddenLayers = false;
app.findChangeGrepOptions.includeLockedLayersForFind = false;
app.findChangeGrepOptions.includeLockedStoriesForFind = false;
app.findChangeGrepOptions.includeMasterPages = false;
//Unicode Range
app.findGrepPreferences.findWhat = "[<0591>-<05BD><05C1><05C2><05C4><05C5><05C7><05BF>]";
app.changeGrepPreferences.changeTo = NothingEnum.nothing;
mySelection.changeGrep();
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
I found the basic <0591> format in this forum by you from 5 years back http://21.adobe-scripting-indesign.overzone.net/find-change-using-unicode-t1610.html
and your answer above gave away the missing details.
I guess it would be a very good idea to buy this book http://shop.oreilly.com/product/9780596156015.do
This script is countless times quicker than the one above.
Thanks a million.
Trevor,
The <0000> format is replaced with the corresponding character in the Find what field, which often makes it barely readable. The \x{0000} format is not replaced, and I find that easier. As to that book, you guess right!
Peter
Thanks Peter,
When I wrote about using the <0000> format I was referring to scripting and not to the GREP tab.
I think you must have seen this post in email form and missed the lines of scripting.
In scripting, these three options work:
app.findGrepPreferences.findWhat = "[<0591>-<05BD><05C1><05C2><05C4><05C5><05C7><05BF>]";
app.findGrepPreferences.findWhat = "[\u0591-\u05BC\u05c2\u05C4\u05C7\u05BF]";
app.findGrepPreferences.findWhat = "[֑-ׇֿׁׂׅׄ]";
This does not:
app.findGrepPreferences.findWhat = "[\x{0591}-\x{05BD}\x{05C1}\x{05C2}\x{05C4}\x{05C5}\x{05C7}\x{05BF}]";
Don't know why.
On the GREP tab, the unlucky option is [\u0591-\u05BC\u05c2\u05C4\u05C7\u05BF], which does not work properly (in fact it hardly works at all!).
[\x{0591}-\x{05BD}\x{05C1}\x{05C2}\x{05C4}\x{05C5}\x{05C7}\x{05BF}] scores top for readability.
Both [<0591>-<05BD><05C1><05C2><05C4><05C5><05C7><05BF>] and [֑-ׇֿׁׂׅׄ] work (as you said, the first becomes the second in the Find what field). They have the readability issue on the one hand, and on the other hand are easier to enter if you can read them.
Anyway, I'm quite pleased that, from not knowing any way to use the GREP tab or the scripting app.findGrepPreferences.findWhat method (besides one diacritic at a time!), I now know 3 for each!
Regards, Trevor
P.s. Plan to get the book later in the day!
If you write out GREP expressions in Javascript to use with findGrep/changeGrep, you must take into account that backslashes inside a Javascript string need escaping. Therefore you need to double each of them:
\\x{0591}
(etc.)
The "exceptions" -- there are always some -- are \r, \t, and \n, but in fact those aren't as special as they seem. They get translated into literal character codes for Carriage Return, Tab, and Line Feed, and as it happens, those can be fed as well into the findWhat string, even though you cannot type them in the interface (after inserting them with your script, sometimes you can see the GREP find field struggle with trying to display the string).
You could check whether the special Unicode GREP class "\p{Mn}" finds all of the non-spacing marks you want to get rid of -- I think this class of commands is mentioned in Peter's book as well.
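The escaping point can be checked in any JavaScript console. The sketch below only inspects what ends up inside the string, which is what findWhat would receive; the variable names are made up for the illustration, and it assumes ExtendScript treats these escapes the way standard JavaScript does.

```javascript
// A backslash before an ordinary letter is simply dropped by the string parser.
var unescaped = "\p{Mn}";   // the string holds p{Mn}, the backslash is gone
var escaped = "\\p{Mn}";    // the string holds \p{Mn}, which is what GREP needs

// \t is a real JavaScript escape: the string holds one literal tab character.
var tab = "\t";

// \u0591 is also a real JavaScript escape: the string holds the character
// itself, so GREP never sees the backslash notation at all.
var mark = "\u0591";
```

This would also explain why the \u notation works from a script but not when typed into the GREP tab: in a script it is JavaScript, not GREP, that interprets it.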
Ah, yes, the Unicode properties \p{ }. They're quite useful. Two of my favourites are \p{Zs} 'all spaces except tab and return' and \p{Pd} 'all hyphens and dashes'. And yes, all 37 of them are described in the book.
Peter
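The same property classes exist in modern JavaScript engines (with the u flag), which makes it easy to see what they cover without opening InDesign. This is a sketch for a current browser or Node console only: ExtendScript's own regex engine is older and is not assumed to support the u flag, and the constant and function names are made up for the illustration.

```javascript
// Unicode property escapes in modern JavaScript (requires the u flag).
var NONSPACING_MARKS = /\p{Mn}/gu;  // combining marks such as Hebrew points
var SPACE_SEPARATORS = /\p{Zs}/gu;  // space characters, but not tab or return
var DASHES = /\p{Pd}/gu;            // hyphens and dashes

// Hypothetical helper: delete every non-spacing mark.
function stripMarks(text) {
    return text.replace(NONSPACING_MARKS, "");
}

// Hypothetical helper: make the matched characters visible.
function showSpacesAndDashes(text) {
    return text.replace(SPACE_SEPARATORS, "_").replace(DASHES, "~");
}
```

Note that \p{Mn} is broader than the Hebrew range discussed earlier: it also matches combining accents from other scripts.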
Peter Kahrel wrote:
Ah, yes, the Unicode properties \p{ }. They're quite useful. Two of my favourites are \p{Zs} 'all spaces except tab and return' and \p{Pd} 'all hyphens and dashes'. And yes, all 37 of them are described in the book.
Wow. How have I gone this long without knowing about these? Guess I should have read your book. Here's another resource.
Jeff
It's never too late, Jeff! That source you mention is indeed very good. It's where I first learnt GREP, back in CS2 days. It's not InDesign-specific though, so not everything discussed there applies to InDesign. Good site nevertheless. Those codes are illustrated with an InDesign document here: http://www.kahrel.plus.com/indesign/grep_mapper.html
Peter
Nice resource, Jeff. There's also a nice GREP mapper PDF table on Peter's site, but for (just) less than $10 I'm sure Peter would second me that it's worth going for the book!!
(Just saw that Peter beat me to it with the mapper)
Jongware
I should have been able to figure out the escaping of the \,
Oh well better luck next time.
So now I have another 2 methods for the scripting:
app.findGrepPreferences.findWhat = "[\\x{0591}-\\x{05BD}\\x{05C1}\\x{05C2}\\x{05C4}\\x{05C5}\\x{05C7}\\x{05BF}]";
and
app.findGrepPreferences.findWhat = "\\p{Mn}";
I did try the \p{Mn} method in the script but it didn't work because I didn't escape it.
and in the grep tab another one
\p{Mn}
Well, suddenly overwhelmed with choice, the winning script is:
var mySelection = app.selection[0];
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
//Unicode Range
app.findGrepPreferences.findWhat = "\\p{Mn}";
app.changeGrepPreferences.changeTo = NothingEnum.nothing;
mySelection.changeGrep();
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
Short and sweet (and quick).
Peter
I kept my word about getting the book, and you can test me on the 37 \p{} methods tomorrow.
If it's brevity you're after:
app.findGrepPreferences = app.changeGrepPreferences = null;
//Unicode Range
app.findGrepPreferences.findWhat = "\\p{Mn}";
app.selection[0].changeGrep();
app.findGrepPreferences = app.changeGrepPreferences = null;
Peter
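If the same few lines are needed with more than one pattern, they could be wrapped in a small function. The sketch below is one possible shape and not part of either script above: deleteMatches is a made-up name, and app and the target are passed in as parameters purely so that the function can be exercised outside InDesign. The try/finally is an addition that resets the preferences even if changeGrep throws.

```javascript
// Hypothetical helper: delete everything in `target` that matches `grepPattern`.
// Inside InDesign, pass the global `app` and, for example, app.selection[0].
function deleteMatches(app, target, grepPattern) {
    // Start from clean preferences, as in the scripts above.
    app.findGrepPreferences = app.changeGrepPreferences = null;
    try {
        app.findGrepPreferences.findWhat = grepPattern;
        // Nothing is set for the change side, so matches are replaced by nothing.
        return target.changeGrep();
    } finally {
        // Leave the Find/Change dialog clean, whatever happened.
        app.findGrepPreferences = app.changeGrepPreferences = null;
    }
}
```

Inside InDesign the call would look like deleteMatches(app, app.selection[0], "\\p{Mn}");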
T Y
I think that by comparing my original script with this final one, one can see an excellent example of how well and how poorly a script can be made.
Well, I'm happy I saw there was a problem, didn't have that "Very British attitude", did complain, and something did change!
(See towards the bottom of http://21.adobe-scripting-indesign.overzone.net/find-change-using-unicode-t1610.html)
Peter
I forgot to mention....
I like the book, I see the basic grep | or function is right on the very first page after the contents although I got a little scared of the python that spat on the second page.
Trevor