GREP: Find consecutive sentences that begin with the same word

Community Beginner, Jun 24, 2022

Can you please tell me how to find consecutive sentences that begin with the same word in Adobe InDesign using a GREP search?

Let's say we are working on a biography: "Elvis was a famous musician. Elvis was born in..." and we need to change this to: "Elvis was a famous musician. He was born in...". I can make the edit manually; I just need to locate such cases, but I don't know how.

Thanks

Correct answer

Community Expert, Jun 27, 2022

Another try:

 

(?<!\h)(\u[\l\u]+?)\h[^!?.]+?[!?.]\h\1

 

 

The grep could still be simplified if necessary.

Community Expert, Jun 25, 2022

I don't think you can do that with GREP, actually. I think you'd need to parse the whole document. I can imagine doing it in Javascript, but not in a single GREP query. Is there a reason that you need to use regular expressions in particular? Or do you just need a tool, any tool that's not your eyeballs, to automate this editorial preference?
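
For readers curious about the scripting route mentioned above, here is a minimal sketch. It is written in Python rather than InDesign's JavaScript, and it assumes the story text has already been exported to a plain string. The function name `repeated_openers` and the naive sentence split are illustrative assumptions, not part of any InDesign API.

```python
import re

def repeated_openers(text):
    """Return (word, index) for each sentence that opens with the same word as the sentence before it."""
    # Naive split (assumption): a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    hits = []
    for i in range(1, len(sentences)):
        prev_word = re.match(r"\w+", sentences[i - 1])
        this_word = re.match(r"\w+", sentences[i])
        if prev_word and this_word and prev_word.group() == this_word.group():
            hits.append((this_word.group(), i))
    return hits

print(repeated_openers("Elvis was a famous musician. Elvis was born in 1935. He sang."))
# [('Elvis', 1)]
```

Because each sentence is compared only with the one directly before it, a word that recurs several sentences later is not reported. Abbreviations such as "Mr." would confuse the naive split, so real text would probably need a more careful sentence splitter.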

Guide, Jun 26, 2022

Basically (it needs more thought, but I'm at the beach at the moment):

 

(\u\H+).+?\.\h\K\1

 

(^/)  The Jedi

Community Beginner, Jun 26, 2022

@Joel Cherney, thanks for your reply.

 

I'm not an expert in these things, so I apologize that I don't understand everything you said. My English is not very good, either.

 

The simple reason why I need such a GREP code (if it can be arranged) is that it is kind of dull or boring when sentences located next to each other begin with the same word (unless it's poetry or a speech, where such repetition can be useful for rhetorical or poetic effect).

 

For example, imagine a biographical book about a musician or an actor, where every sentence begins with his or her name. That would be dull. That's why I wrote that simple example about Elvis, but it can be Lennon or anyone else.

 

What I need is a GREP code like this:

any word - some text in between - punctuation that closes a sentence (a period, an exclamation mark or something similar) - that word again (the one from the very beginning)

 

With GREP it is possible to locate duplicated words written next to each other by mistake (which often happens), but I don't know how to modify that GREP code to suit my needs (if that's possible).

 

The code: \b(\w+)\b \1
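
As a side note on the doubled-word pattern above, here is a small Python sketch of how the backreference `\1` behaves. `DOUBLED` is the pattern as quoted. `DOUBLED_STRICT` is an assumed refinement (not from the thread) that adds a trailing `\b`, so the repeat has to be a whole word. The helper name `doubled_words` is hypothetical.

```python
import re

# The pattern from the post, unchanged (a literal space separates the two words).
DOUBLED = re.compile(r"\b(\w+)\b \1")
# Assumed refinement: a trailing \b so the repeat must be a complete word.
DOUBLED_STRICT = re.compile(r"\b(\w+) \1\b")

def doubled_words(text, pattern=DOUBLED_STRICT):
    """Return the words that are immediately repeated in text."""
    return [m.group(1) for m in pattern.finditer(text)]

print(doubled_words("It was the the best"))   # ['the']
print(doubled_words("the then", DOUBLED))     # ['the']  (matches the start of "then")
print(doubled_words("the then"))              # []
```

The same idea, a captured word followed later by `\1`, is what the sentence-level patterns in this thread build on. The difference is that the gap between the two occurrences is longer than a single space.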

 

@FRIdNGE, thank you for your reply. Unfortunately your code didn't work for me, but I appreciate it.

 

Community Expert, Jun 27, 2022

@Sotir25004881oiei  Michel's code does work, though only with periods as sentence-final markers.

Community Beginner, Jun 27, 2022

@Peter Kahrel and @FRIdNGE, it seems that you are right, thank you. But unfortunately, it doesn't work for my document, maybe because it is not in English (it's in Cyrillic script). In my case, the search just catches seemingly random words, each of them located at the beginning of a sentence.

Then I added some English text to the document as a test and it worked (e.g. "Elvis was a famous musician. Elvis was born in..." or something like: "Orange juice is made out of fruits. Orange juice is good.").

 

Maybe there's a way to modify this code, so it can work for me. Thanks.

Community Expert, Jun 27, 2022

@Sotir25004881oiei 

Please write some real text that does not work as desired in Cyrillic here in the forum.

Community Beginner, Jun 27, 2022

@pixxxelschubser, thanks for your reply. An update:

This code by @FRIdNGE actually works with Cyrillic, I'm sorry I was wrong. But it works only in simple examples such as this one (Cyrillic script, you can copy/paste it into InDesign and search within it):

 

"Физиката е естествена наука, която изучава общите и фундаментални закономерности, които определят изграждането и еволюцията на материалния свят. Физиката е точна наука, което означава, че се занимава с количественото описание на природните явления.".

 

But here's a more complicated example. You'll notice that the search will find the same word as the first one in the paragraph, but the duplicate is not located in the next sentence, so this is not what we want:


"Математиците търсят определени „образци на шаблони“, за да формулират нови теореми, аксиоми и типове математически доказателства. Когато откритите и изучени математически структури са базирани на добри (идеални, репетативни или количествено обозрими) модели, може да се използва математическо доказателство при създаването на научни прогнози и предвиждания за определени теми, области или обекти. Интересни дискусии и аргументи първо се появяват в древногръцката математика (а преди това, тоест още от предисторията се използва за изчисляване, измерване и за изучаване на формите и движенията на физическите обекти чрез дедуктивни разсъждения и абстракции), а по-късно математиката се развива в доста сложна и многостранна наука за абстрактни количествени и качествени връзки, форми и структури, с нейните аксиоматични системи от късния 19 век, като вече се приема за обичайно да се разглеждат математическите изследвания като установяване на математическата и научна истината чрез строги дедукции с използване на избрани аксиоми, научни дефиниции и определения. Математиката е от съществено значение в много области, включително естествени науки, инженерство, медицина, финанси и социални науки. Приложна математика доведе до изцяло нови математически дисциплини, като статистика и теория на игрите. Математиците се занимават с чиста математика (математика заради себе си), без да имат предвид каквото и да било приложение, но практическите приложения за това, което започна като чиста математика, често се откриват по-късно".

 

Conclusion: That's why when I try to search in the actual book, which is quite big, it catches some words that seem just random to me. Maybe they're identified as duplicates of something that appeared long before. Maybe several sentences before. And then, after a long search (find next, find next...), it actually finds the things that I'm really looking for, but this is too complicated. This code is not bad, it partially works, but it needs a little bit of improvement.

Community Expert, Jun 27, 2022

That's probably because in the first instance the quotation mark is included.

Community Beginner, Jun 27, 2022

quote

That's probably because in the first instance the quotation mark is included.


By @Peter Kahrel

 

I'm afraid that this has nothing to do with the quotation mark. Here's an English translation of that Cyrillic text without any quotation marks. Please copy it into an InDesign document and use the code composed by @FRIdNGE to search within it:

 

(beginning of text)

 

Mathematicians look for specific patterns to formulate new theorems, axioms, and types of mathematical proofs. When the mathematical structures discovered and studied are based on good (ideal, repetitive or quantitatively observable) models, mathematical proof can be used in making scientific predictions and predictions for specific topics, areas or objects. Interesting discussions and arguments first appeared in ancient Greek mathematics (and before that, that is, from prehistory it was used to calculate, measure and study the shapes and movements of physical objects through deductive reasoning and abstractions), and later mathematics developed in quite complex and multifaceted science of abstract quantitative and qualitative relations, forms and structures, with its axiomatic systems of the late 19th century, and it is now accepted to consider mathematical research as establishing mathematical and scientific truth by rigorous deductions using selected axioms, scientific definitions and definitions. Mathematics is essential in many fields, including science, engineering, medicine, finance, and the social sciences. Applied mathematics has led to entirely new mathematical disciplines, such as statistics and game theory. Mathematicians deal with pure mathematics

 

(end of text)

 

We see that the paragraph begins with the word Mathematicians. The GREP search will find the duplicate in the text, but it is not in the next sentence. It's in the last sentence.
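
The behaviour described above can be reproduced outside InDesign. The sketch below is a Python approximation, not the InDesign pattern itself: Python's `re` has no `\u`, `\H`, `\h` or `\K`, so `[A-Z]`, `\S`, `[ \t]` and a second capture group stand in for them (all assumptions). The sample is an abridged version of the English text above. `LOOSE` mirrors the `.+?` gap, which is free to run across periods. `TIGHT` is a hypothetical variant whose gap `[^.]+` cannot cross a period.

```python
import re

# Assumed stand-ins: [A-Z] for \u, \S for \H, [ \t] for \h, group 2 instead of \K.
LOOSE = re.compile(r"([A-Z]\S+).+?\.[ \t](\1)")            # gap may cross periods
TIGHT = re.compile(r"\b([A-Z]\w+)\b[^.]+\.[ \t](\1)\b")    # gap stops at the first period

sample = ("Mathematicians look for specific patterns. "
          "Mathematics is essential in many fields. "
          "Mathematicians deal with pure mathematics.")

loose_hit = LOOSE.search(sample)
tight_hit = TIGHT.search(sample)

print(loose_hit.start(2) == sample.rindex("Mathematicians"))  # True: found in the last sentence
print(tight_hit)                                              # None: no consecutive repeat here
```

With `.+?` the engine keeps extending the gap until the captured word shows up again after some period, however many sentences later. Limiting the gap to non-period characters is what ties the second occurrence to the very next sentence. Like the original, `TIGHT` only treats periods as sentence ends.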

Community Expert, Jun 27, 2022

Indeed, the quote has nothing to do with it. Anyway, it works fine for me. Your example contains two instances of Математиците: at the start of the text and at the beginning of the last sentence. 

Guide, Jun 27, 2022

Try this:

 

Find: (\u\w+\b)[^.]+\.\K\h(?=\1)

Replace by: $0@

 

What it does: the regex finds the same word at the start of the following sentence and adds an "@" in front of it!

 

Then you just need to play:

 

Find: @\H+

Replace by: He

 

… If only men!  😉

 

(^/)
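
The two-step idea above (mark first, then deal with the marks) can be tried out in Python. This is only a sketch under assumptions: `[A-Z]` stands in for `\u` (Latin capitals only), `[ \t]` for `\h`, `\S` for `\H`, and because Python's `re` has no `\K`, groups 1 and 2 are written back unchanged so that only the "@" is added. The names `MARK` and `mark_repeats` are hypothetical.

```python
import re

# Assumed Python stand-in for: (\u\w+\b)[^.]+\.\K\h(?=\1)
# The lookahead does not consume the repeated word, so chains of repeats are all marked.
MARK = re.compile(r"\b([A-Z]\w+)\b([^.]+\.[ \t])(?=\1\b)")

def mark_repeats(text):
    """Insert '@' before a sentence-initial word that repeats the capitalised word captured before it."""
    return MARK.sub(r"\1\2@", text)

text = "Elvis was a famous musician. Elvis was born in 1935. Elvis sang."
marked = mark_repeats(text)
print(marked)
# Elvis was a famous musician. @Elvis was born in 1935. @Elvis sang.

# Second step from the post: replace each marked word.
print(re.sub(r"@\S+", "He", marked))
# Elvis was a famous musician. He was born in 1935. He sang.
```

If automatic replacement feels too risky, the marks can simply be used as search targets for manual editing and stripped again afterwards, for example with `marked.replace("@", "")`, provided "@" does not otherwise occur in the text.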

Community Beginner, Jun 27, 2022

quote

Indeed, the quote has nothing to do with it. Anyway, it works fine for me. Your example contains two instances of Математиците: at the start of the text and at the beginning of the last sentence. 


By @Peter Kahrel

 

Yes, it works for me like that, too. But this does not solve my problem. I would like to find only consecutive sentences that begin with the same word, i.e. sentences written one after another.

Community Expert, Jun 27, 2022

It may be a bit silly, but you could try:

(find from - to)

(\u[\l\u]+?)\h[^!?.]+?[!?.]\h\1

or find only consecutive (second) instance

(\u[\l\u]+?)\h[^!?.]+?[!?.]\h\K\1
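
To see the difference between the two variants above, here is a rough Python equivalent. It is a sketch under assumptions: the character ranges `A-ZА-Я` and `a-zа-я` stand in for `\u` and `\l` (basic Latin and Cyrillic only), `[ \t]` stands in for `\h`, and since Python's `re` has no `\K`, the second occurrence is read from capture group 2 instead. The helper name `find_repeats` is hypothetical.

```python
import re

UP, LOW = "A-ZА-Я", "a-zа-я"   # assumption: rough stand-ins for InDesign's \u and \l
PATTERN = re.compile(rf"([{UP}][{LOW}{UP}]+?)[ \t][^!?.]+?[!?.][ \t](\1)")

def find_repeats(text):
    """Return (from_to_text, span_of_second_word) for every hit."""
    return [(m.group(0), m.span(2)) for m in PATTERN.finditer(text)]

print(find_repeats("Elvis was a famous musician. Elvis was born in 1935."))
# [('Elvis was a famous musician. Elvis', (29, 34))]
```

`m.group(0)` corresponds to the "from - to" variant (everything from the first word to its repeat), while `m.span(2)` corresponds to the `\K` variant (only the second word). Because `[^!?.]+?` cannot cross a sentence-ending mark, the repeat has to sit right after the first such mark.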

 

Community Expert, Jun 27, 2022

Ah -- I lost track of 'consecutive' 🙂

Community Beginner, Jun 27, 2022

 

quote

It may be a bit silly, but you could try:

(find from - to)

(\u[\l\u]+?)\h[^!?.]+?[!?.]\h\1

or find only consecutive (second) instance

(\u[\l\u]+?)\h[^!?.]+?[!?.]\h\K\1

By @pixxxelschubser

 

@pixxxelschubser, thank you very much. Your solutions are very close to what I was looking for. They're not perfect (no offence), but they're very close. And I find your first code very useful, because it highlights the text ("from/to").

 

The problem is that the 1st occurrence of the word (let's call it "the original") and the 2nd occurrence (let's call it "a duplicate") should each be located at the beginning of their respective sentences.

Your code is able to find that, which is great, but it also finds situations where the "original" is somewhere in the middle of the sentence. For example, try this: "There are many popular musicians, but Elvis is the king. Elvis was born in...". Your code will find the first "Elvis" even though it's not at the beginning of the sentence. The comma does not matter here, you can try without it.

@FRIdNGE, thank you very much, too. Your solutions are also close to what I was looking for. It's just that I would avoid automatic replacement (I'm afraid of messing something up); I would rather just locate the problem and edit it manually.

 

Thanks to both of you once again. These codes are a step in the right direction and they just need a little bit of tweaking.

Community Expert, Jun 27, 2022

Another try:

 

(?<!\h)(\u[\l\u]+?)\h[^!?.]+?[!?.]\h\1

 

 

The grep could still be simplified if necessary.
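
A Python approximation of this pattern may help to see what the added `(?<!\h)` does. Everything here is a sketch under assumptions: `A-ZА-Я` and `a-zа-я` stand in for `\u` and `\l`, and `[ \t]` stands in for `\h`. `POSTED` mirrors the pattern above. `VARIANT` is a hypothetical alternative, not from the thread, that only rejects a first word preceded by a non-terminator character plus a space.

```python
import re

UP, LOW = "A-ZА-Я", "a-zа-я"  # assumption: rough stand-ins for InDesign's \u and \l
CORE = rf"([{UP}][{LOW}{UP}]+?)[ \t][^!?.]+?[!?.][ \t](\1)"

# Stand-in for the pattern above: the first word must not follow a space or tab.
POSTED = re.compile(r"(?<![ \t])" + CORE)
# Hypothetical variant: the first word may also follow ". ", "! " or "? ".
VARIANT = re.compile(r"(?<![^!?.][ \t])" + CORE + r"\b")

mid_sentence = "There are many popular musicians, but Elvis is the king. Elvis was born in 1935."
para_start = "Elvis was a famous musician. Elvis was born in 1935."
mid_para = "He was a star. Elvis was a famous musician. Elvis was born in 1935."

for name, text in [("mid_sentence", mid_sentence), ("para_start", para_start), ("mid_para", mid_para)]:
    print(name, bool(POSTED.search(text)), bool(VARIANT.search(text)))
# mid_sentence False False
# para_start True True
# mid_para False True
```

In this approximation the lookbehind removes the mid-sentence false positive, as intended. It also suggests something that may be worth checking in InDesign itself: since a sentence in the middle of a paragraph normally follows a space, a repeated pair that starts there is rejected too, and only a pair whose first sentence opens the paragraph is found. If that turns out to matter, a lookbehind along the lines of `VARIANT` is one possible direction.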

Community Expert, Jun 29, 2022

@Sotir25004881oiei 

Have you tried it?

Community Beginner, Jun 30, 2022

quote

@Sotir25004881oiei 

Have you tried it?


By @pixxxelschubser

 

Yes, thank you so much! I marked it as a correct answer!

 

I've been very busy these days, so I'm sorry that I was absent from the forum and am late with my reply. This is fantastic, thank you so much!

Community Expert, Jul 02, 2022

I love being proven wrong when I say something foolish like "I don't think you can do that with GREP in InDesign." 🙂
