
Recursive Grep puzzle

Guide, Feb 04, 2019

Hi Grep enthusiasts.

Here's a puzzle I'm struggling to solve.

I want to set up my FindChangeList.txt file to find and replace typographic quotation marks that are nested inside another pair (a quotation within a quotation).

What I have

Je vous cite, pour mémoire, le procès-verbal de la dernière réunion : « L’Assemblée juge nécessaire de proposer la modification suivante à l’article 8 du règlement : « Le conseil se compose au maximum de neuf membres. » » Jacques Godbout dépeint le climat politique qui régnait dans les années quarante : « Dans le village de Lanoraie, où nous passions les vacances d’été, un curé affublé d’une perruque carrée, le teint blême, terminait ses sermons par la célèbre formule : « L’enfer est rouge, le ciel est bleu. » Nous étions, enfants libéraux, condamnés à l’enfer. » Comme on dit : « Et voilà ! »

What I want

Je vous cite, pour mémoire, le procès-verbal de la dernière réunion : « L’Assemblée juge nécessaire de proposer la modification suivante à l’article 8 du règlement : “Le conseil se compose au maximum de neuf membres.” » Jacques Godbout dépeint le climat politique qui régnait dans les années quarante : « Dans le village de Lanoraie, où nous passions les vacances d’été, un curé affublé d’une perruque carrée, le teint blême, terminait ses sermons par la célèbre formule : “L’enfer est rouge, le ciel est bleu.” Nous étions, enfants libéraux, condamnés à l’enfer. » Comme on dit : « Et voilà ! »

The colored version for a better understanding:

grep.jpg

Of course, the regex must also work if the text is already "clean".

I have the feeling the solution lies in the recursion operator (?R), but it's poorly documented and I can't figure it out.

See : Recursive Regex—Tutorial and (?R) grep recursive pattern …

Any suggestion welcome.

Vinny


Correct answer

Community Expert, Feb 04, 2019

You don't really need recursion, you're after a single embedding. If the inner quote has no formatting at all, you could do this:

Find what: «[^»]+\K«(.+?)»

Change to: “$1”

If the inner quote does have formatting, index markers, anchors, or the like, you can't use this, because the replacement would wipe them out; in that case you need a script.
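For anyone who wants to try the idea outside InDesign, here is a minimal sketch in plain JavaScript. It is not the InDesign query itself: JavaScript has no \K, so the prefix is captured and written back instead. The function name `convertInnerQuotes`, the sample sentence, and the `\s*` parts (which drop the spaces next to the inner guillemets) are assumptions made for this illustration.

```javascript
// Emulates "Find what: «[^»]+\K«(.+?)»" / "Change to: “$1”" outside InDesign.
// No \K in JavaScript, so group 1 holds the prefix and is put back unchanged.
// Assumption: \s* strips the (thin) spaces hugging the inner guillemets.
function convertInnerQuotes(paragraph) {
  return paragraph.replace(/(«[^»]+)«\s*(.+?)\s*»/g, "$1“$2”");
}

// Hypothetical sample: one nested quotation, then a simple one.
const sample =
  "Il dit : « Formule : « Le ciel est bleu. » Fin. » Puis : « Et voilà ! »";

const converted = convertInnerQuotes(sample);
console.log(converted);

// Running it again on already clean text should change nothing.
console.log(convertInnerQuotes(converted) === converted);
```

The negated class [^»]+ is what keeps the match inside a single outer quotation: the prefix can never cross a closing », so a « that opens a new top-level quotation is not treated as nested.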

P.

Guide, Feb 04, 2019

Try this grep, Vinny.

find:

«.*\K«(?=.*)|»(?=.*»)

replace with:

"

ezgif.com-video-to-gif(1).gif

Guide, Feb 04, 2019

Hi Vladan.

Thanks a lot for your help.

Unfortunately, that won't do.

First thing is that because of Grep greed, «.*\K« will catch the last « of the paragraph.

We could work around this by using the shortest match operator: «.*?\K«

Problem is that if you replay the query, it will now catch the first « of the second quotation, which should be left untouched.
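The greedy and lazy behaviour described above can be reproduced with a small JavaScript sketch. The sample string and the helper `posOfMatchedGuillemet` are made up for this illustration, and since JavaScript has no \K, the position of the final « is computed from a captured prefix instead.

```javascript
// Hypothetical sample: a nested quotation followed by a simple one.
// Index 0 is the first outer «, index 4 the inner «, index 14 the second outer «.
const text = "« a « b » » c « d »";

// Returns the index of the « that "prefix\K«" would select, or -1 if no match.
function posOfMatchedGuillemet(re, s) {
  const m = re.exec(s);
  return m === null ? -1 : m.index + m[1].length;
}

// Greedy: .* runs as far as it can, so the last « of the paragraph is selected.
const greedyPos = posOfMatchedGuillemet(/(«.*)«/, text);

// Lazy: .*? stops at the first « after the opening one (the inner quote).
const lazyPos = posOfMatchedGuillemet(/(«.*?)«/, text);

// Simulate one replacement with the lazy pattern, then replay the query:
// the next « found is the opening mark of the second, top-level quotation.
const once = text.slice(0, lazyPos) + "“" + text.slice(lazyPos + 1);
const replayPos = posOfMatchedGuillemet(/(«.*?)«/, once);

console.log(greedyPos, lazyPos, replayPos);
```

Neither quantifier knows where the outer quotation ends, which is why both variants eventually reach a « that should be left untouched.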

map.gif

Guide, Feb 05, 2019

Hi Peter,

thanks a lot for your answer.

You're right, I wrongly focused on the recursion while a negative class was definitely what I needed.

Just added the missing thin spaces and your regex works like a charm.

The only thing that slightly bothers me is that, yes, a variable, or most probably a footnote could sometimes be included in the inner quote.

And the idea is to set up the FindChangeList.txt file, so no scripting can be involved.

Anyhow, it still is a good start.

Thanks a lot

Vinny

Community Expert, Feb 04, 2019

. . . which is (or could be) very simple:

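// Reset the GREP search settings, then look for nested « … » pairs.
app.findGrepPreferences = null;
app.findGrepPreferences.findWhat = '«[^»]+\\K«(.+?)»';
var f = app.documents[0].findGrep();
// Change only the first and last character of each match, so formatting,
// footnotes, and markers inside the inner quote are left alone.
for (var i = 0; i < f.length; i++) {
  f[i].characters[0].contents = '“';
  f[i].characters[-1].contents = '”';
}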

Tested on only a tiny sample.

P.
