GREP find and replace identical values

Participant, Mar 20, 2024

Hi Knowledge People

What is the simplest syntax to find identical "values" (may be words or numbers) in a document?

Thanks

TOPICS
Scripting

Community Expert, Mar 20, 2024

@Gioyer07

 

You have to be more specific. 

 

One after another - or in different places? 

 

Participant, Mar 20, 2024

I need to find identical values in a long list of more than 15,000 numbers, spread across several pages but in the same text frame. I would then delete the duplicates to reach the desired page count for the book, which has to be a multiple of 16 pages for offset printing.

samenumbers.png

Well... to be more precise, I'd also need to distinguish the numbers that come before the separator (or have more than three digits) from the rest.

Community Expert, Mar 20, 2024

@Gioyer07

 

Then GREP alone won't be able to do this.

 

You would need to first get a list of all values.

 

Then - it depends on the tool you'll use...

 

And you mean the same Story - not the same Text Frame. 

 

Participant, Mar 20, 2024

Never mind the values after the codes: they are page numbers and can be told apart by their character style. I just need a syntax that finds identical numbers. Thank you.

 

Community Expert, Mar 20, 2024

@Gioyer07

 

Are they sorted already? Or in random order?

 

Participant, Mar 20, 2024

They are only partially sorted. Sorting is usually the last operation I do... I've installed a script for this purpose.

Community Expert, Mar 20, 2024

@Gioyer07

 

If you'll sort them anyway - you could copy everything to Excel and remove duplicates there.

 

The only "problem" - this "NEW" tag - it's anchored / inline, right? 
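
For anyone who would rather not go through Excel, the "remove duplicates" step itself is small. What follows is only a rough Python sketch, not something posted in this thread. It assumes each line looks like `code<TAB>page`, keeps the first line seen for each code, and drops any later line with the same code, which is roughly what a spreadsheet's remove-duplicates command does on the first column. The name `drop_duplicate_codes` and the sample values are made up for illustration.

```python
def drop_duplicate_codes(lines):
    """Keep only the first line for each code (the text before the first tab)."""
    seen = set()   # codes already encountered
    kept = []      # lines that survive, in their original order
    for line in lines:
        code = line.split("\t", 1)[0]
        if code not in seen:
            seen.add(code)
            kept.append(line)
    return kept


# Hypothetical sample: unsorted "code<TAB>page" lines with repeated codes.
sample = ["1001\t12", "1002\t3", "1001\t15", "1003\t8", "1002\t9"]
for line in drop_duplicate_codes(sample):
    print(line)
```

Note that this drops the whole line, page number included, so it only fits the "simple removal of duplicates" case. It does not need the list to be sorted first.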

 

Participant, Mar 20, 2024

Yes. The NEW image would be lost...

Community Expert, Mar 20, 2024

quote

Yes. The NEW image would be lost...


By @Gioyer07

 

I'm pretty sure you can search for the anchored / inline marker, replace it with some unique text - go to Excel and do what you need - then replace this unique "marker" with the contents of the Clipboard - this NEW tag.

 

Community Expert, Mar 20, 2024

quote

Yes. The NEW image would be lost...


By @Gioyer07

 

RobertTkaczyk_0-1710943958851.png

 

RobertTkaczyk_1-1710943972343.png

 

Community Expert, Mar 20, 2024

quote

Yes. The NEW image would be lost...


By @Gioyer07

 

Or you should rather do this:

RobertTkaczyk_0-1710944185534.png

 

1) replace "^t" -> "^t^t"

2) replace "^a^t" -> "^t#NEW#"

 

so you'll have your "tag" in a separate column so it won't affect sorting and you'll have all your number as values - not text.
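
To make the effect of those two replacements concrete, here is a small Python sketch of what they do to one line. It is an illustration under assumptions, not part of the original suggestion: `"\t"` plays the role of `^t`, the character U+FFFC stands in for the anchored object marker `^a`, and the function name `to_spreadsheet_columns` is invented.

```python
ANCHOR = "\ufffc"  # assumed stand-in for the anchored-object marker (^a)


def to_spreadsheet_columns(line):
    """Apply the two text replacements in order to a single line."""
    # 1) replace "^t" -> "^t^t": every tab is doubled, opening an empty column.
    line = line.replace("\t", "\t\t")
    # 2) replace "^a^t" -> "^t#NEW#": the marker moves into that new column as text.
    return line.replace(ANCHOR + "\t", "\t#NEW#")


# Hypothetical lines: one with the inline NEW graphic, one without.
print(repr(to_spreadsheet_columns("1001" + ANCHOR + "\t12")))
print(repr(to_spreadsheet_columns("1002\t7")))
```

Every line ends up with three tab-separated columns (code, tag, page). The code column holds nothing but the number, so a spreadsheet can treat it as a value and sort on it.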

 

Community Expert, Mar 20, 2024

quote
quote

Yes. The NEW image would be lost...


By @Gioyer07

 

Or you should rather do this:

RobertTkaczyk_0-1710944185534.png

 

1) replace "^t" -> "^t^t"

2) replace "^a^t" -> "^t#NEW#"

 

so you'll have your "tag" in a separate column so it won't affect sorting and you'll have all your number as values - not text.

 


By @Robert at ID-Tasker

 

Then:

RobertTkaczyk_1-1710944913430.png

 

Of course you would first have to copy your graphic representation of this NEW tag to the Clipboard.

 

Community Expert, Mar 20, 2024

There are GREP expressions to find identical values in a text, but in your case, numbers in a table, they won't work. But finding those duplicate values (and acting upon any found) is pretty easy to script. Excel isn't needed at all.

 

Can you indicate what should be done in the list in your screenshot? There are various duplicate numbers before the tab separator, followed by 3-digit numbers after the tab. What should be done about it?

Community Expert, Mar 20, 2024

quote

There are GREP expressions to find identical values in a text, but in your case, numbers in a table, they won't work. But finding those duplicate values (and acting upon any found) is pretty easy to script. Excel isn't needed at all.

 

Can you indicate what should be done in the list in your screenshot? There are various duplicate numbers before the tab separator, followed by 3-digit numbers after the tab. What should be done about it?


By @Peter Kahrel

 

It's not a table per se - it's TAB-delimited text in text columns.

I think a script would be overkill - doing it in Excel should be much easier - I mean the OP would have full control over the text and wouldn't have to wait for a scripting person to make any extra changes... at least in this case - simple removal of duplicates.

 

Unless, OP wants to keep page numbers to build an Index - but still, it can be done in Excel...

 

...

 

Unless ... the OP wants to convert those page numbers - into Hyperlinks / CrossReferences...

 

Community Expert, Mar 20, 2024

quote
[...]
 
Unless ... the OP wants to convert those page numbers - into Hyperlinks / CrossReferences...

 

But in that case - InDesign can build the Index automatically - or a completely different script could be used:

 

https://creativepro.com/files/kahrel/indesign/lists_indexes.html

 

Participant, Mar 20, 2024

The expected result would be to delete every repeated code after its first occurrence (never mind the values after the codes: they are page numbers and can be told apart by their character style).

samenumbers.png
samenumbers2.png

 

Community Expert, Mar 20, 2024

quote

The expected result would be to replace with empty all the digits after the first one (never mind about any other value after the codes: they are related to the page numbers and can be discriminated differentiating the character styles).


samenumbers2.png

 

By @Gioyer07

 

Then as @Peter Kahrel said - GREP should be able to find duplicates - ignoring texts in between.

 

Community Expert, Mar 20, 2024

It's not a table per se - it's TABdelimited text in text columns.

 

Well, his screenshot shows a table. But in the event that it's a plain (tab-delimited) column, a GREP expression can find those duplicate values.

 

Let's see what the OP replies.

Participant, Mar 20, 2024

Correct. It is not a table. Just plain text. Right now the codes and the pages have the same style, but I will rebuild the index and assign a different style to the pages, so the GREP search will apply to the codes only.

Community Expert, Mar 20, 2024

quote

It's not a table per se - it's TABdelimited text in text columns.

 

Well, his screenshot shows a table. But in the event that it's a plain (tab-delimited) column, a GREP expression can find those duplicate values.

 

Let's see what the OP replies.


By @Peter Kahrel

 

Those lines are just Underline.

A table would have "#" at the end of the "line":

RobertTkaczyk_0-1710945618359.png

 

Community Expert, Mar 20, 2024

You can do it with three queries. First mark all lines with identical first values, don't mark the first occurrence:

Find what: ^(.+?~y).+\r\K(\1.+\r)+
Change to: <Leave empty>
Find format: <Leave empty>
Change format: +Strikethrough

Then, on the lines that have strikethrough applied, delete everything up to the tab:

Find what: ^.+?(?=~y)
Change to: <Leave empty>
Find format: +Strikethrough
Change format: <Leave empty>

Finally, remove all strikethrough:

Find what: <Leave empty>
Change to: <Leave empty>
Find format: +Strikethrough
Change format: -Strikethrough
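
For readers who want to check the logic outside InDesign, the net effect of the three queries can be sketched in Python. This is only an approximation under stated assumptions, not the GREP itself. In the queries, `~y` is InDesign's right indent tab and `\K` keeps the first line out of the match, so only the repeats get marked. Python's `re` module has no `\K`, so the sketch uses a plain loop instead. Here `"\t"` stands in for the right indent tab, `"\n"` for the paragraph return, and the name `blank_repeated_codes` is invented.

```python
def blank_repeated_codes(text, sep="\t"):
    """Blank the code on every line that repeats the previous line's code."""
    out = []
    previous_code = None  # code of the line just above, as it was originally
    for line in text.split("\n"):
        code, found, rest = line.partition(sep)
        if found and code and code == previous_code:
            # Same code as the line above: keep the separator and what follows.
            out.append(sep + rest)
        else:
            out.append(line)
        # Lines without a separator break the run of identical codes.
        previous_code = code if found else None
    return "\n".join(out)


# Hypothetical sample: 1001 appears three times in a row, then once more later.
sample = "1001\t12\n1001\t15\n1001\t20\n1002\t3\n1001\t7"
print(blank_repeated_codes(sample))
```

Like the queries, the sketch only catches repeats on consecutive lines, so the last `1001` in the sample keeps its code. That is why the list should be sorted before this step.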

Participant, Mar 20, 2024

Thank you Peter

I uploaded the IDD document here. 

I should have done it straight at the beginning.

Great help and kindness.

 

Community Expert, Mar 20, 2024

quote

Thank you Peter

I uploaded the IDD document here. 

I should have done it straight at the beginning.

Great help and kindness.


By @Gioyer07

 

You need to do "purge" / "garbage collection" - do SAVE AS with a new name from time to time.

 

 

OK, looks like you've done Save As earlier today:

 

Recovered MiniSave on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 04 April 2023 at 11:15
Save As on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 09:05
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 09:08
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 09:17
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 09:41
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:06
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:06
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:07
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:07
Book - repaginate on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:09
Most recent Save on Windows x64 10.0 in app version 17.4.1.67 (FS InDesign Roman) build 67 on 20 March 2024 at 15:09
Open As Copy on Windows x64 10.0 in app version 19.3.0.58 (FS InDesign Roman) build 58 on 20 March 2024 at 15:41
Converted on Windows x64 10.0 in app version 19.3.0.58 (FS InDesign Roman) build 58 on 20 March 2024 at 15:41
Save As on Windows x64 10.0 in app version 19.3.0.58 (FS InDesign Roman) build 58 on 20 March 2024 at 15:42

 

But those repaginations added 70MB??

 

RobertTkaczyk_0-1710949649772.png

 

Community Expert, Mar 20, 2024

Thanks for the file. When I tried the GREP queries on your (big) file I noticed that those pesky inline graphics get in the way. They can't be accommodated in the queries I gave, so you need two additional queries.

 

This one goes first, before anything else. It places a tab before an inline:

Find what: (?=~a)
Change to: ~y
Find format: <Leave empty>
Change format: <Leave empty>

 Then run this one at the very end to delete the tabs before the inlines:

Find what: ~y(?=~a)
Change to: <Leave empty>
Find format: <Leave empty>
Change format: <Leave empty>

So that's five queries altogether. If you use these more than once, you can use this script to run them all in one go:

https://creativepro.com/files/kahrel/indesign/grep_query_runner.html
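
The order of the five queries matters, and a small Python sketch may make it easier to see why. It is an illustration under assumptions, not a replacement for the queries or for the script linked above. `"\t"` stands in for the right indent tab (`~y`), U+FFFC for the anchored object marker (`~a`), `"\n"` for the paragraph return, and the function name is invented. One plausible reading of the problem is that a code followed directly by an inline graphic does not match the same code on a line without one, so a separator is placed between them first.

```python
ANCHOR = "\ufffc"  # assumed stand-in for the anchored-object marker (~a)


def dedupe_codes_with_inlines(text, sep="\t"):
    """Emulate the five queries in order on tab-delimited lines."""
    # Query 1 (first): put a separator in front of every inline marker.
    text = text.replace(ANCHOR, sep + ANCHOR)

    # Queries 2-4: blank the code on lines that repeat the previous line's code.
    out = []
    previous_code = None
    for line in text.split("\n"):
        code, found, rest = line.partition(sep)
        repeated = bool(found) and code != "" and code == previous_code
        out.append(sep + rest if repeated else line)
        previous_code = code if found else None

    # Query 5 (last): remove the separators sitting directly before a marker.
    return "\n".join(out).replace(sep + ANCHOR, ANCHOR)


# Hypothetical sample: the NEW graphic follows the code on the first and last line.
sample = "1001" + ANCHOR + "\t12\n1001\t15\n1002\t3\n1002" + ANCHOR + "\t9"
print(repr(dedupe_codes_with_inlines(sample)))
```

On a repeated line that carries the inline graphic, the code disappears but the graphic and the page number stay. As with the last query, the final step removes any separator that sits directly before a marker, including one that was there before the first step ran.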
