
Simple GREP deletion of lines

Participant, Jun 30, 2022

I need a simple way to find and delete duplicate lines in a column of about 16K items.

Each line has the form: VARIABLE NUMBERS(\t)PAGE NUMBER(\p)

Thank you for your help.

TOPICS
How to

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines

1 Correct answer

Participant, Mar 29, 2023

Solution (for newbies like me): follow these steps.

1) Clean stray spaces, tabs, etc. out of the document (turn on Show Hidden Characters to see them).

2) Apply GREP:

Find:

^(.+\r)\1+

Replace:

$1
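
For readers who want to check what this find/replace does outside InDesign, here is a minimal Python sketch. It assumes the paragraphs have been exported as plain text, with `\n` standing in for InDesign's `\r` paragraph end; the function name `collapse_adjacent_duplicates` is made up for illustration. Note that the pattern only collapses identical lines that directly follow each other; duplicates separated by other lines are left alone.

```python
import re

def collapse_adjacent_duplicates(text: str) -> str:
    """Collapse each run of identical consecutive lines into a single line."""
    # ^(.+\n) captures one whole line; \1+ matches one or more exact repeats of it.
    return re.sub(r"^(.+\n)\1+", r"\1", text, flags=re.MULTILINE)

# Sample lines taken from the thread, with a plain tab between the columns.
sample = "011101850\t17\n011101850\t17\n011101850\t17\n011102250\t17\n"
print(repr(collapse_adjacent_duplicates(sample)))
```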

Community Expert, Jun 30, 2022

Hi @Gioyer07:

 

GREP is pattern-based. It looks like your pattern is:

a series of numbers, a tab, more numbers and a hard return.

 

If these are the only paragraphs in the file that follow that pattern, use:

 

Find what: \d+\t\d+\r

Change to: (leave empty)

Save the file first!
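
A quick way to see what this query would match is to try it on sample text outside InDesign. The Python sketch below is only an illustration: it assumes `\n` stands in for InDesign's `\r`, and the sample lines are taken from the thread with a plain tab between the columns. Because the pattern describes every "numbers, tab, numbers" paragraph, replacing with nothing removes all of them, not only the duplicates, which is one more reason to save first.

```python
import re

# Same shape as the "Find what" field: digits, tab, digits, end of paragraph.
pattern = re.compile(r"\d+\t\d+\n")

sample = "011101850\t17\n011101850\t17\n011102250\t17\n"
matches = pattern.findall(sample)   # every paragraph that fits the pattern
remaining = pattern.sub("", sample)  # what is left after "Change to: (empty)"
print(len(matches), repr(remaining))
```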

 

~Barb

Participant, Jul 01, 2022

Thank you, Barb.

My attempts to use your syntax failed. That is probably my fault, because I did not describe the text I need to clean well enough.

It has this structure:

011101850 17
011101850 17
011102250 17

The same three lines appear in the find/replace window as shown in the picture: syntax.png

I really appreciate your help. Thank you.

Participant, Jul 01, 2022

Thank you, Barb.

My attempts to use your syntax didn't work, and that is due to my poor description.

The picture shows more clearly how the lines are built:

syntax.png

As you can see, a "y" is used instead of a "t"... but that is not working either. Thank you for your help.

 

Community Expert, Jul 01, 2022

Hi @Gioyer07,

To understand the issue properly: are the duplicate lines always adjacent, and do we need to detect the duplicates and keep only a single instance of each?

-Manan

Participant, Jul 05, 2022

Sorry for the delay. No, the duplicate lines are not always adjacent. Thank you.

 

Community Expert, Jul 02, 2022

To remove a single duplicate that directly follows its original, you can try something like this:

(\d+~y\d+\r)\K\1

This assumes each paragraph is: numbers, right indent tab (~y), numbers, end of paragraph.

If that does not work, please post a screenshot with the hidden characters visible.
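
Python's built-in `re` module has no `\K`, so the sketch below gets the same effect by capturing the first paragraph and putting it back in the replacement. It is an illustration only: `\t` stands in for InDesign's `~y`, `\n` for `\r`, and the name `drop_following_duplicate` is hypothetical. As with the query above, a single pass removes only one directly following duplicate per match.

```python
import re

def drop_following_duplicate(text: str) -> str:
    """Remove one exact copy that directly follows a 'digits tab digits' line."""
    # Keeping group 1 and dropping its repeat mirrors what \K does in the query.
    return re.sub(r"(\d+\t\d+\n)\1", r"\1", text)

two = "011101850\t17\n011101850\t17\n011102250\t17\n"
three = "011101850\t17\n" * 3
print(repr(drop_following_duplicate(two)))
print(repr(drop_following_duplicate(three)))  # one copy still remains after a single pass
```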

Participant, Jul 05, 2022

Thank you very much for the answer. Sorry for the delay; the office computer was switched off. The screenshot shows precisely the hidden characters in the white box. Your solution seems to be right; please allow me some time to double-check it. Thank you.

 

Community Expert, Jul 05, 2022

Quote from @Gioyer07:

… The screenshot precisely shows the hidden characters in the white box…

Are you sure? Which screenshot do you mean?

Both of your screenshots show guides and layout lines, but the hidden characters are not visible.

To show them, use the menu: Type --> Show Hidden Characters

[Ctrl]+[Alt]+[I]

 

Participant, Jul 05, 2022

Thank you... well, if you look at the guides and layout lines you certainly see them... but in the find/replace window you can clearly see the hidden characters... perhaps you need to click the image and view it full screen if you can't see it in the chat: syntax.png

Community Expert, Jul 05, 2022

No.

That is not what I mean!

Go to the menu: Type --> Show Hidden Characters (or similar).

The hidden characters will then be shown as blue characters/symbols.

Then create a new screenshot for us.

 

Participant, Jul 06, 2022

Thank you for your patience; here is the correct screenshot.

I entered

(\d+~y\d+\r)\K\1

in the Find field and left the Replace field empty.

Pressing 'Replace All' made one substitution (although there are hundreds of exactly identical lines); pressing 'Replace All' again found and deleted nothing more.

 

 

 

Participant, Jul 06, 2022

HiddenChar.jpg

Community Expert, Jul 06, 2022

It is strange that the GREP query finds anything at all. There are spaces between the numbers and the tab, which you did not mention before.

Community Expert, Jul 06, 2022

Hi all,

I would try this one:

Find GREP:

(\d{7}\h{2}~y\d{3}\r)\1

Change GREP:

$1

Do that two or more times until nothing is found.

And let's hope that the pattern for all paragraphs* is:

A run of 7 digits, followed by two horizontal white-space characters, a right indent tab, a run of 3 digits, and an end-of-paragraph special character.

[*] I can see from the screenshot that we perhaps also have to deal with anchored objects.
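
The "repeat until nothing is found" step can be sketched as a loop. This Python version is only a stand-in for the InDesign query, under several assumptions: two literal spaces replace `\h{2}`, `\t` replaces `~y`, `\n` replaces `\r`, the sample lines are invented to fit the 7-digit/3-digit shape, and `dedupe_until_stable` is a made-up name. It still only removes duplicates that sit directly next to each other.

```python
import re

# 7 digits, two spaces, tab, 3 digits, end of paragraph, then an exact repeat.
PAIR = re.compile(r"(\d{7} {2}\t\d{3}\n)\1")

def dedupe_until_stable(text: str) -> tuple[str, int]:
    """Apply the change repeatedly; return the text and how many passes changed it."""
    passes = 0
    while True:
        text, found = PAIR.subn(r"\1", text)
        if found == 0:
            return text, passes
        passes += 1

line = "0111018  \t017\n"            # hypothetical sample paragraph
other = "0111022  \t017\n"
result, passes = dedupe_until_stable(line * 4 + other)
print(passes, repr(result))
```

Four identical paragraphs need two passes here, because each pass only folds a paragraph into the one directly before it.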

 

Regards,
Uwe Laubender
( Adobe Community Professional )

Community Expert, Jul 06, 2022

Hm.

Perhaps it's better, as a first step, to get rid of any white-space characters between the first run of digits and the right indent tab, and then tackle the issue of duplicate paragraphs. I'm not sure whether the number of white spaces after the first run of digits is always the same. (Just a guess, of course.)
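
A minimal Python sketch of that two-step idea, assuming plain text with `\t` standing in for the right indent tab and `\n` for the paragraph end (both substitutions, and the function names, are illustrative only): first strip any run of spaces that sits directly before the tab, then collapse adjacent duplicate lines.

```python
import re

def strip_spaces_before_tab(text: str) -> str:
    """Step 1: delete any run of spaces that directly precedes a tab."""
    return re.sub(r" +(?=\t)", "", text)

def collapse_adjacent(text: str) -> str:
    """Step 2: collapse runs of identical consecutive lines into one."""
    return re.sub(r"^(.+\n)\1+", r"\1", text, flags=re.MULTILINE)

# The first two lines differ only in the number of spaces before the tab.
sample = "011101850  \t17\n011101850 \t17\n011102250\t17\n"
cleaned = collapse_adjacent(strip_spaces_before_tab(sample))
print(repr(cleaned))
```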

 

Regards,
Uwe Laubender
( Adobe Community Professional )

Community Expert, Jul 06, 2022

Hello Uwe,

Unfortunately not. It is not that simple. Please take another look at the first screenshots.

  • Digits (variable, between 7 and 10)
  • Letter (possibly one)
  • Spaces (possibly several)
  • Anchored object (possibly, position unknown)
  • Tab (column separator)
  • Digits (variable count, perhaps up to 4?)
  • End of paragraph
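
The structure listed above can be written down as one flexible pattern. The Python sketch below is a guess at such a pattern, not a tested InDesign query: it assumes anchored objects appear in exported text as U+FFFC (the object replacement character), that `\t` stands in for the tab, and the names `OBJ`, `LINE` and `parse_line` are invented for illustration.

```python
import re
from typing import Optional

OBJ = "\ufffc"  # assumed marker for an anchored object in exported text

# digits (7-10), optional letter, optional spaces, tab, digits (1-4)
LINE = re.compile(r"^(\d{7,10}[A-Za-z]?) *\t(\d{1,4})$")

def parse_line(line: str) -> Optional[tuple[str, str]]:
    """Return (code, page) if the line fits the structure, else None."""
    # The object's position is unknown, so drop its marker wherever it occurs.
    m = LINE.match(line.replace(OBJ, ""))
    return (m.group(1), m.group(2)) if m else None

print(parse_line("011101850  \t17"))
print(parse_line(f"0111018A {OBJ}\t123"))
print(parse_line("not a data line"))
```

Two lines that parse to the same (code, page) pair could then be treated as duplicates even if their spacing or anchored objects differ.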

 

 

Community Expert, Jul 06, 2022

Thank you, @pixxxelschubser !

You are right! Then we have to talk about "sameness" and the flexible patterns with \K. And what that means for atomic groups. Or we have to take a shortcut and write a script. 🙂
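
As a rough picture of what the script shortcut could do, here is a sketch written in Python on exported text rather than as an actual InDesign script. Everything in it is an assumption for illustration: the name `dedupe_lines`, the `key` parameter that defines "sameness", and the sample lines. Unlike the GREP queries, it also catches duplicates that are not adjacent.

```python
from typing import Callable, Iterable

def dedupe_lines(lines: Iterable[str],
                 key: Callable[[str], str] = lambda s: s) -> list[str]:
    """Keep the first line for each key, in original order; drop later ones."""
    seen: set[str] = set()
    kept: list[str] = []
    for line in lines:
        k = key(line)
        if k not in seen:
            seen.add(k)
            kept.append(line)
    return kept

lines = ["011101850\t17", "011102250\t17", "011101850\t17", "011101850 \t17"]
exact = dedupe_lines(lines)
# "Sameness" that ignores spaces: the last line now counts as a duplicate too.
loose = dedupe_lines(lines, key=lambda s: s.replace(" ", ""))
print(exact)
print(loose)
```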

 

Let's see what our OP thinks about all this…

 

Thanks,
Uwe Laubender
( Adobe Community Professional )

Participant, Jul 08, 2022

This is beyond my skills... The only thing I am sure of is that I also got the title wrong, because it is not "Simple" after all... Thank you to the experts who replied and proposed usable solutions. At the moment I just can't say that it is "solved".

Participant, Jul 12, 2022

Excuse me if I insist... Would it be possible to tell GREP to ignore ANY object before checking for duplicates?

Would this reduce the complexity of the task? Thank you
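
Whether InDesign's GREP can do this directly is not settled in this thread. Outside InDesign, ignoring objects before comparing is straightforward, as the Python sketch below suggests. It assumes anchored objects show up in exported text as U+FFFC and that spaces should be ignored as well; the function names and the sample lines are hypothetical.

```python
OBJ = "\ufffc"  # assumed marker for an anchored object in exported text

def comparison_key(line: str) -> str:
    """The line as it should be compared: object markers and spaces removed."""
    return line.replace(OBJ, "").replace(" ", "")

def drop_duplicates_ignoring_objects(lines: list[str]) -> list[str]:
    seen = set()
    kept = []
    for line in lines:
        k = comparison_key(line)
        if k in seen:
            continue  # same code and page as an earlier line, object or not
        seen.add(k)
        kept.append(line)  # the kept line still carries its own object, if any
    return kept

lines = [f"011101850{OBJ}\t17", "011101850\t17", "011102250\t17"]
result = drop_duplicates_ignoring_objects(lines)
print(result)
```

The comparison happens on a cleaned copy of each line, so the kept lines themselves are not altered.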

Community Expert, Jul 08, 2022

Hi @Gioyer07 ,

Usually, the issue of duplicate paragraphs should be solved at the data source, so that duplicate paragraphs do not show up in InDesign after the content is placed.

 

Regards,
Uwe Laubender
( Adobe Community Professional )

Participant, Jul 11, 2022

Thank you, Laubender.

That would be a solution if only the backend data knew the page numbers of all the codes. As it happens, it does not and never will, but the page numbers are needed for printing.

Community Expert, Jul 12, 2022

Please upload a one-page example IDML (with no confidential data, but enough samples) to a file host of your choice and post the link here.

Participant, Jul 18, 2022

I uploaded it here using WeTransfer. The link is usually available for only a week, so I wonder whether it would be more useful to upload the 4 MB document somewhere else, with no expiry date, for other potentially interested users. Thank you so much.
