Hi Grep enthusiasts.
Here's a puzzle I'm struggling to solve.
I want to set up my FindChangeList.txt file to find and replace typographic quotation marks when they are nested inside other quotation marks (double quotation).
What I have
Je vous cite, pour mémoire, le procès-verbal de la dernière réunion : « L’Assemblée juge nécessaire de proposer la modification suivante à l’article 8 du règlement : « Le conseil se compose au maximum de neuf membres. » » Jacques Godbout dépeint le climat politique qui régnait dans les années quarante : « Dans le village de Lanoraie, où nous passions les vacances d’été, un curé affublé d’une perruque carrée, le teint blême, terminait ses sermons par la célèbre formule : « L’enfer est rouge, le ciel est bleu. » Nous étions, enfants libéraux, condamnés à l’enfer. » Comme on dit : « Et voilà ! »
What I want
Je vous cite, pour mémoire, le procès-verbal de la dernière réunion : « L’Assemblée juge nécessaire de proposer la modification suivante à l’article 8 du règlement : “Le conseil se compose au maximum de neuf membres.” » Jacques Godbout dépeint le climat politique qui régnait dans les années quarante : « Dans le village de Lanoraie, où nous passions les vacances d’été, un curé affublé d’une perruque carrée, le teint blême, terminait ses sermons par la célèbre formule : “L’enfer est rouge, le ciel est bleu.” Nous étions, enfants libéraux, condamnés à l’enfer. » Comme on dit : « Et voilà ! »
[Image: colored version of the text above, for better understanding]
Of course, the regex must also work if the text is already "clean".
I have the feeling the solution lies in the recursion operator (?R), but it's poorly documented and I can't figure it out.
See : Recursive Regex—Tutorial and (?R) grep recursive pattern …
Any suggestion welcome.
Vinny
You don't really need recursion, you're after a single embedding. If the inner quote has no formatting at all, you could do this:
Find what: «[^»]+\K«(.+?)»
Change to: “$1”
If the inner quote does have formatting, index markers, anchors, or the like, you can't use this, and you need a script instead.
P.
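The query above can be tried outside InDesign with a rough plain-JavaScript approximation. This is only a sketch under a few assumptions: JavaScript regular expressions have no \K, so the text before the inner quote is captured and written back instead; `fixNestedQuotes` is a hypothetical helper name; and the sample string is a shortened stand-in for the paragraph in the question. Spacing around the quotation marks is left as-is.

```javascript
// Plain-JavaScript approximation of the GREP query above.
// JavaScript has no \K, so the text before the inner quote is captured
// as $1 and written back unchanged.
const NESTED = /(«[^»]+)«(.+?)»/g;

// Hypothetical helper name, not part of InDesign or FindChangeByList.
function fixNestedQuotes(text) {
  return text.replace(NESTED, '$1“$2”');
}

// Shortened stand-in for the paragraph in the question.
const dirty = '« Outer : « Inner. » » Next : « Plain. »';
const clean = fixNestedQuotes(dirty);

console.log(clean);
// A second pass should leave already-clean text alone.
console.log(fixNestedQuotes(clean) === clean);
```

Because `[^»]+` cannot cross a closing », the pattern only fires when a second « appears before the first quotation has closed, which is why text that is already clean should come through unchanged.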
Try this GREP, Vinny:
find:
«.*\K«(?=.*)|»(?=.*»)
replace with:
"
Hi Vladan.
Thanks a lot for your help.
Unfortunately, that won't do.
First, because GREP is greedy, «.*\K« will catch the last « of the paragraph.
We could work around this with the shortest-match operator: «.*?\K«
The problem is that if you run the query again, it will now catch the first « of the second quotation, which should be left untouched.
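The greedy and lazy behaviour described here can be checked with a small plain-JavaScript sketch. The assumptions: JavaScript has no \K, so the sketch looks at the last character of the whole match instead; `targetContext` is a hypothetical helper; and the sample is a shortened stand-in for the paragraph, where « C. » and « F. » are the inner quotes, « E opens the second outer quotation, and « I ! » is a standalone quotation.

```javascript
// Which « would each variant of the query replace?
const sample = 'A : « B : « C. » » D : « E : « F. » G. » H : « I ! »';

// Hypothetical helper: returns the 4 characters starting at the «
// that ends the match (the one \K« would hand to the replacement).
function targetContext(re, text) {
  const m = re.exec(text);
  if (m === null) return null;
  const idx = m.index + m[0].length - 1; // position of the final «
  return text.slice(idx, idx + 4);
}

const greedy = /«.*«/;
const lazy = /«.*?«/;

console.log(targetContext(greedy, sample)); // the « of the last quotation
console.log(targetContext(lazy, sample));   // the « of the first inner quote

// Simulate the first lazy replacement, then replay the query.
const firstHit = lazy.exec(sample);
const at = firstHit.index + firstHit[0].length - 1;
const afterFirstPass = sample.slice(0, at) + '“' + sample.slice(at + 1);
console.log(targetContext(lazy, afterFirstPass)); // now an outer «
```

On this sample the greedy form lands on the standalone quotation, and the replayed lazy form lands on the opening « of the second outer quotation, which matches the two problems described above.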
Hi Peter,
thanks a lot for your answer.
You're right, I wrongly focused on recursion when a negated character class was what I needed.
Just added the missing thin spaces and your regex works like a charm.
The only thing that slightly bothers me is that, yes, a variable, or most probably a footnote could sometimes be included in the inner quote.
And the idea is to set up the FindChangeList.txt file, so no scripting can be involved.
Anyhow, it still is a good start.
Thanks a lot
Vinny
. . . which is (or could be) very simple:
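// Reset any previous GREP search settings.
app.findGrepPreferences = null;
// \K discards everything matched so far, so each hit spans only the inner « ... ».
app.findGrepPreferences.findWhat = '«[^»]+\\K«(.+?)»';
var f = app.documents[0].findGrep();
// Swap only the first and last characters of each hit; the text in between is left untouched.
for (var i = f.length - 1; i >= 0; i--) {
    f[i].characters[0].contents = '“';
    f[i].characters[-1].contents = '”';
}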
Tested on only a tiny sample.
P.