Inspiring
March 20, 2024
Answered

GREP find and replace identical values

  • March 20, 2024
  • 5 replies
  • 1928 views

Hi Knowledge People

What is the simplest syntax to find identical "values" (may be words or numbers) in a document?

Thanks

This topic has been closed for replies.
Correct answer Peter Kahrel

Thanks for the file. When I tried the GREP queries on your (big) file I noticed that those pesky inline graphics get in the way. They can't be accommodated in the queries I gave, so you need two additional queries.

 

This one goes first, before anything else. It places a tab before an inline:

Find what: (?=~a)
Change to: ~y
Find format: <Leave empty>
Change format: <Leave empty>

 Then run this one at the very end to delete the tabs before the inlines:

Find what: ~y(?=~a)
Change to: <Leave empty>
Find format: <Leave empty>
Change format: <Leave empty>

So that's five queries altogether. If you use these more than once, you can use this script to run them all together:

https://creativepro.com/files/kahrel/indesign/grep_query_runner.html
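
For readers who want to see what the two extra queries do mechanically, here is a minimal Python sketch. It is not InDesign code: it assumes that U+FFFC stands in for the anchored-object marker (`~a`) and that a plain tab character stands in for `~y`. The function names are made up for the illustration.

```python
import re

# Assumption: U+FFFC stands in for InDesign's anchored-object marker (~a).
ANCHOR = "\uFFFC"


def add_tabs_before_inlines(text: str) -> str:
    """Mirror 'Find what: (?=~a) / Change to: ~y'.

    The lookahead is zero-width, so the marker itself is not consumed;
    a tab is simply inserted in front of it.
    """
    return re.sub(rf"(?={ANCHOR})", "\t", text)


def remove_tabs_before_inlines(text: str) -> str:
    """Mirror 'Find what: ~y(?=~a) / Change to: <empty>'.

    Deletes any tab that sits directly before a marker.
    """
    return re.sub(rf"\t(?={ANCHOR})", "", text)


sample = "1234" + ANCHOR + "\t101"
with_tabs = add_tabs_before_inlines(sample)
restored = remove_tabs_before_inlines(with_tabs)
```

Note that the clean-up step removes every tab that directly precedes a marker, including one that was there before the first step ran, so the round trip is only exact when no such tab existed originally.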

5 replies

Peter Kahrel
Community Expert
March 20, 2024

You can do it with three queries. First, mark all lines with identical first values, leaving the first occurrence unmarked:

Find what: ^(.+?~y).+\r\K(\1.+\r)+
Change to: <Leave empty>
Find format: <Leave empty>
Change format: +Strikethrough

Then delete everything that has strikethrough applied, up to the tab:

Find what: ^.+?(?=~y)
Change to: <Leave empty>
Find format: +Strikethrough
Change format: <Leave empty>

Finally, remove all strikethrough:

Find what: <Leave empty>
Change to: <Leave empty>
Find format: +Strikethrough
Change format: -Strikethrough
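
As a rough illustration of the combined effect of these three queries, here is a minimal Python sketch working on plain text outside InDesign. It assumes one entry per line, a tab between the code and the page number, and `\n` as the line separator (InDesign itself uses `\r`). The function name is hypothetical, and the sketch only approximates the GREP behaviour.

```python
def blank_repeated_codes(text: str, sep: str = "\n") -> str:
    """Blank the code before the tab on lines that repeat the previous code.

    Only consecutive repeats are affected, and the first line of each
    run keeps its code, as in the strikethrough-based queries.
    """
    result = []
    previous_code = None
    for line in text.split(sep):
        code, tab, rest = line.partition("\t")
        if tab and code and code == previous_code:
            # Same code as the run's first line: keep only the tab and what follows.
            result.append("\t" + rest)
        else:
            result.append(line)
            # A line without a tab (or without a code) ends the current run.
            previous_code = code if tab and code else None
    return sep.join(result)


sample = "1001\t12\n1001\t15\n1001\t20\n1002\t7\n"
cleaned = blank_repeated_codes(sample)
```

With the sample above, the second and third lines keep their tab and page number but lose the repeated code `1001`.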
Gioyer07
Author
Inspiring
March 20, 2024

Thank you Peter

I uploaded the IDD document here. 

I should have done it straight at the beginning.

Great help and kindness.

 

Robert at ID-Tasker
Legend
March 20, 2024
quote

Thank you Peter

I uploaded the IDD document here. 

I should have done it straight at the beginning.

Great help and kindness.


By @Gioyer07

 

You need to do a "purge" / "garbage collection" - do a Save As with a new name from time to time.

 

 

OK, looks like you've done Save As earlier today:

 

Recovered MiniSave on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 04 April 2023 at 11:15
Save As on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 09:05
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 09:08
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 09:17
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 09:41
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:06
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:06
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:07
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:07
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:09
Most recent Save on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:09
Open As Copy on Windows x64 10.0 in app version 19.3.0.58 (FS InDesign Roman) build 58 on 20 March 2024 at 15:41
Converted on Windows x64 10.0 in app version 19.3.0.58 (FS InDesign Roman) build 58 on 20 March 2024 at 15:41
Save As on Windows x64 10.0 in app version 19.3.0.58 (FS InDesign Roman) build 58 on 20 March 2024 at 15:42

 

But those repaginations added 70MB??

 

 

Peter Kahrel
Community Expert
March 20, 2024

"It's not a table per se - it's tab-delimited text in text columns."

 

Well, his screenshot shows a table. But in the event that it's a plain (tab-delimited) column, a GREP expression can find those duplicate values.

 

Let's see what the OP replies.

Gioyer07
Author
Inspiring
March 20, 2024

Correct. It is not a table. Just plain text. Right now the codes and the pages share the same style, but I will rebuild the index and assign a different style to the pages, so the GREP search will apply to the codes only.

Peter Kahrel
Community Expert
March 20, 2024

There are GREP expressions to find identical values in a text, but in your case, numbers in a table, they won't work. But finding those duplicate values (and acting upon any found) is pretty easy to script. Excel isn't needed at all.

 

Can you indicate what should be done in the list in your screenshot? There are various duplicate numbers before the tab separator, followed by 3-digit numbers after the tab. What should be done about it?

Robert at ID-Tasker
Legend
March 20, 2024
quote

There are GREP expressions to find identical values in a text, but in your case, numbers in a table, they won't work. But finding those duplicate values (and acting upon any found) is pretty easy to script. Excel isn't needed at all.

 

Can you indicate what should be done in the list in your screenshot? There are various duplicate numbers before the tab separator, followed by 3-digit numbers after the tab. What should be done about it?


By @Peter Kahrel

 

It's not a table per se - it's tab-delimited text in text columns.

 

I think a script would be overkill - doing it in Excel should be much easier - I mean the OP would have full control over the text and wouldn't have to wait for a scripting person to make any extra changes... at least in this case - simple removal of duplicates.

 

Unless the OP wants to keep the page numbers to build an Index - but still, that can be done in Excel...

 

...

 

Unless ... the OP wants to convert those page numbers into Hyperlinks / CrossReferences...

 

Robert at ID-Tasker
Legend
March 20, 2024
quote
[...]
 
Unless ... the OP wants to convert those page numbers into Hyperlinks / CrossReferences...

 

But in that case - InDesign can build an Index automatically - or a completely different script could be used:

 

https://creativepro.com/files/kahrel/indesign/lists_indexes.html

 

Robert at ID-Tasker
Legend
March 20, 2024

@Gioyer07

 

You have to be more specific. 

 

One after another - or in different places? 

 

Gioyer07
Author
Inspiring
March 20, 2024

I'd need to search for identical values in a long list of numbers, over 15,000 of them, spread across several pages but in the same text frame. I would then replace them with nothing (that is, delete them) in order to reach the desired number of pages for the book (which needs to be a multiple of 16 pages to suit offset printing).

Well... to be more precise, I'd also need to distinguish the numbers that come before the separator (or have more than three digits) from the numbers after it.

Robert at ID-Tasker
Legend
March 20, 2024

@Gioyer07

 

Then GREP alone won't be able to do this.

 

You would need to first get a list of all values.

 

Then - it depends on the tool you'll use...

 

And you mean the same Story - not the same Text Frame.
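
To illustrate the "get a list of all values first" step, here is a minimal Python sketch that works on the story's text once it has been exported as plain text. It assumes one entry per line with a tab between the code and the page number; the function name is hypothetical. Unlike a GREP query, it finds duplicates wherever they occur, not only on consecutive lines.

```python
from collections import Counter


def duplicated_codes(text: str) -> dict[str, int]:
    """Return every code (the field before the first tab) that occurs
    more than once, mapped to its number of occurrences."""
    codes = [
        line.split("\t", 1)[0]      # keep only the part before the first tab
        for line in text.splitlines()
        if "\t" in line             # skip lines that have no separator
    ]
    counts = Counter(codes)
    return {code: n for code, n in counts.items() if n > 1}


sample = "1001\t12\n1002\t15\n1001\t20\n300\t7\n1002\t9\n1002\t11"
duplicates = duplicated_codes(sample)
```

Because only the field before the tab is counted, the page numbers after the separator never register as duplicates, whatever their length.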