Simple GREP deletion of lines

Explorer, Jun 30, 2022

I need a simple way to find and delete duplicate lines in a column of 16K items.

VARIABLE NUMBERS (\t) PAGE NUMBER (\p)

Thank you for your help.

TOPICS
How to

Views

310

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more

Adobe Community Professional, Jun 30, 2022

Hi @Gioyer07:

 

GREP is pattern-based. It looks like your pattern is:

a series of numbers, a tab, more numbers and a hard return.

 

If these are the only paragraphs in the file that follow that pattern, use:

 

Find what: \d+\t\d+\r

Change to: (leave empty)

 

Save the file first!

 

~Barb
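
As a rough illustration of what that Find pattern selects, here is a minimal Python sketch. It is not InDesign code: the sample text is made up, the name `matching_paragraphs` is hypothetical, and a plain-text newline stands in for the hard return (`\r`). Note that with an empty Change to field, every paragraph that matches is removed, not only the repeated ones.

```python
import re

# Python stand-in for the InDesign GREP \d+\t\d+\r.
# Each paragraph is tested on its own, so the hard return is implied
# by the line split rather than written into the pattern.
PARAGRAPH = re.compile(r"\d+\t\d+")


def matching_paragraphs(text):
    """Return the paragraphs the pattern would select (and delete)."""
    return [line for line in text.split("\n") if PARAGRAPH.fullmatch(line)]


# Hypothetical sample: two tab-separated lines, one space-separated line,
# and one line of ordinary text.
sample = "011101850\t17\n011101850\t17\n011102250 17\nHeading\n"
print(matching_paragraphs(sample))
```

Only the two tab-separated lines are selected. The line that uses a space instead of a tab is skipped, which is one way a pattern like this can appear to find nothing.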

Explorer, Jul 01, 2022

Thank you, Barb.

My attempts to use your syntax failed. It is probably my fault, because I did not describe the text I need to clean well enough.

It has this structure:

011101850 17
011101850 17
011102250 17

The picture shows how the same three lines look in the Find/Change dialog: syntax.png

I really appreciate your help. Thank you.

Explorer, Jul 01, 2022

Thank you, Barb.

My attempts to use your syntax didn't work, and that is due to my poor description.

The picture shows more clearly how the lines are structured:

syntax.png

As you will notice, a "y" is used instead of a "t", but that is not working either. Thank you for your help.

 

Adobe Community Professional, Jul 01, 2022

Hi @Gioyer07,

To understand the issue properly: are the duplicate lines always adjacent, and do we need to detect the duplicates and keep only a single instance of each?

-Manan

Explorer, Jul 05, 2022

Excuse the delay. No, the duplicate lines are not always adjacent. Thank you.

 

Adobe Community Professional, Jul 02, 2022

To remove a duplicate that directly follows its original, you can try something like this:

(\d+~y\d+\r)\K\1

This assumes each paragraph is: numbers, tab, numbers, end of paragraph.

If that does not work, please post a screenshot with the hidden characters visible.

Explorer, Jul 05, 2022

Thank you very much for the answer. Excuse the delay, I had the office computer off. The screenshot precisely shows the hidden characters in the white box. Your solution seems to be right; please allow me some time to double-check it. Thank you.

 

Adobe Community Professional, Jul 05, 2022

Quote from @Gioyer07:

"… The screenshot precisely shows the hidden characters in the white box…"

Are you sure? Which screenshot do you mean?

Both of your screenshots show guides and layout lines, but the hidden characters are not visible.

I mean the menu: Type --> Show Hidden Characters

[Ctrl]+[Alt]+[I]

 

Explorer, Jul 05, 2022

Thank you... Well, if you look at the guides and layout lines, you certainly see them, but in the find/replace window you can clearly see the hidden characters. Perhaps you need to click the picture and view it full screen if you can't see it in the chat: syntax.png

Adobe Community Professional, Jul 05, 2022

No.

That is not what I mean!

Go to the menu: Type --> Show Hidden Characters (or similar).

The hidden characters will then be shown as blue characters/signs.

Then create a new screenshot for us.

 

Explorer, Jul 06, 2022

Thank you for your patience, here is the correct screenshot.

I applied

(\d+~y\d+\r)\K\1

in the Find line and nothing in the Replace line.

Pressing 'Replace All' gave one substitution (while there are hundreds of exactly equal lines). Pressing 'Replace All' again produced no further findings or deletions.

 

 

 

Explorer, Jul 06, 2022

HiddenChar.jpg

Adobe Community Professional, Jul 06, 2022

It is strange that the GREP finds anything at all. There are spaces between the numbers and the tab, which you did not mention before.

Adobe Community Professional, Jul 06, 2022

Hi all,

I would try this one:

Find GREP:

(\d{7}\h{2}~y\d{3}\r)\1

Change GREP:

$1

Do that two or more times until nothing is found.

And let's hope that the pattern for all paragraphs* is:

A range of 7 digits, followed by two horizontal white spaces, followed by a right indent tab, followed by a range of 3 digits, followed by an end-of-paragraph special character.

 

[*] I can see from the screenshot that we perhaps also have to deal with anchored objects.

 

Regards,
Uwe Laubender
( Adobe Community Professional )
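
The advice to repeat the replacement until nothing is found can be illustrated with a small Python sketch. This is only an approximation under stated assumptions: Python's `re` has no `\h` or `~y`, so a literal space and `\t` stand in for them, `\n` stands in for the end of paragraph, the digit counts are loosened to `\d+`, and the function name and sample data are hypothetical. The `^` anchor is an addition in this sketch so that a match cannot start in the middle of a longer number.

```python
import re

# (paragraph)(same paragraph again) -> (paragraph), anchored at line start.
ADJACENT_DUPLICATE = re.compile(r"^(\d+ *\t\d+\n)\1", re.MULTILINE)


def collapse_adjacent_duplicates(text):
    """Repeat the replacement until a pass changes nothing.

    Returns the cleaned text and the number of passes that made changes.
    """
    passes = 0
    while True:
        text, count = ADJACENT_DUPLICATE.subn(r"\1", text)
        if count == 0:
            return text, passes
        passes += 1


# Hypothetical sample: three identical paragraphs, then a different one.
line = "1234567  \t017\n"
other = "7654321  \t018\n"
cleaned, passes = collapse_adjacent_duplicates(line * 3 + other)
print(repr(cleaned), passes)
```

One pass turns three identical paragraphs into two, because each match consumes a pair; a second pass is needed to reach a single copy. Like the GREP above, this only handles duplicates that sit directly next to each other.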

Adobe Community Professional, Jul 06, 2022

Hm.

Perhaps it's better, as a first step, to get rid of possible white space characters between the first range of digits and the right indent tab, and then tackle the issue with duplicate paragraphs. I'm not sure if the number of white spaces after the first range of digits is always the same. (Just a guess, of course.)

 

Regards,
Uwe Laubender
( Adobe Community Professional )

Adobe Community Professional, Jul 06, 2022

Hello Uwe,

Unfortunately not. It is not that simple. Please take another look at the first screenshots.

  • Digits (variable, between 7 and 10)
  • Letter (possibly one)
  • Spaces (possibly several)
  • Anchored object (possibly, position unknown)
  • Tab (splits the line into columns)
  • Digits (variable count, perhaps up to 4 ??)
  • End of paragraph
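
The paragraph structure listed above could be expressed roughly as follows. This is a sketch in Python, not an InDesign GREP. It assumes that an anchored object shows up in the text as the object replacement character U+FFFC, and because its position is unknown it is simply removed before matching. The names `LINE` and `parse_line` are hypothetical, and the limits (7 to 10 digits, at most one letter, up to 4 page digits) are taken from the list and may not hold for the real document.

```python
import re

# Assumption: anchored objects appear as U+FFFC in the paragraph text.
ANCHOR = "\ufffc"

# digits (7-10), optional letter, optional spaces, tab, digits (1-4)
LINE = re.compile(r"(\d{7,10}[A-Za-z]?) *\t(\d{1,4})")


def parse_line(paragraph):
    """Return (code, page) if the paragraph fits the structure, else None."""
    match = LINE.fullmatch(paragraph.replace(ANCHOR, ""))
    return (match.group(1), match.group(2)) if match else None


print(parse_line("011101850 \t17"))
print(parse_line("0111018A  \ufffc\t123"))
print(parse_line("Heading"))
```

Paragraphs that return `None` do not fit the assumed structure and would need to be checked by hand.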

 

 

Adobe Community Professional, Jul 06, 2022

Thank you, @pixxxelschubser !

You are right! Then we have to talk about "sameness" and the flexible patterns with \K. And what that means for atomic groups. Or we have to take a shortcut and write a script. 🙂

 

Let's see what our OP thinks about all this…

 

Thanks,
Uwe Laubender
( Adobe Community Professional )
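
To give an idea of what the script shortcut might look like, here is a minimal sketch in Python that works on paragraphs exported as plain text. Inside InDesign the same idea would have to be written in its own scripting language, which is not shown here. The definition of "sameness" is an assumption made for this sketch: two paragraphs count as duplicates when their code and page number agree after anchored-object placeholders (assumed to be U+FFFC) and surrounding spaces are ignored. The names `comparison_key` and `drop_duplicates` are hypothetical.

```python
def comparison_key(paragraph):
    """Build the value used to decide whether two paragraphs are the same."""
    cleaned = paragraph.replace("\ufffc", "")  # assumed anchor placeholder
    code, _, page = cleaned.partition("\t")
    return (code.strip(), page.strip())


def drop_duplicates(paragraphs):
    """Keep the first occurrence of each paragraph, in original order.

    Duplicates do not need to be adjacent.
    """
    seen = set()
    kept = []
    for paragraph in paragraphs:
        key = comparison_key(paragraph)
        if key not in seen:
            seen.add(key)
            kept.append(paragraph)
    return kept


# Hypothetical sample with non-adjacent duplicates.
paragraphs = [
    "011101850 \t17",
    "011102250\t17",
    "011101850\t17",
    "011101850\ufffc \t17",
    "011101850\t18",
]
print(drop_duplicates(paragraphs))
```

Because every paragraph is looked up in a set, this handles the non-adjacent duplicates that a single GREP with a backreference cannot reach, and it needs only one pass over the 16K items.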

Explorer, Jul 08, 2022

This is beyond my skills... The only thing I am sure of is that I also got the title wrong, because it is not "Simple" at all... Thank you to the experts who reply or propose usable solutions. At the moment I just can't say that it is "solved".

Explorer, Jul 12, 2022

Excuse me if I insist... Would it be possible to tell GREP to ignore ANY object before checking for duplicates?

Would this reduce the complexity of the task? Thank you

Adobe Community Professional, Jul 08, 2022

Hi @Gioyer07 ,

Usually the issue with duplicate paragraphs should be solved in the data source, so that duplicate paragraphs do not show up in InDesign after placing the content.

 

Regards,
Uwe Laubender
( Adobe Community Professional )

Explorer, Jul 11, 2022

Thank you, Laubender.

That would be a solution if only the backend data knew the page numbers of all the codes. It does not, and never will, but they are needed for printing.

Adobe Community Professional, Jul 12, 2022

Please upload a one-page example IDML (with no confidential data, but enough samples) to a hosting service of your choice and link it here.

Explorer, Jul 18, 2022

I uploaded it here using WeTransfer. The link is usually available for only a week, and I wonder whether the 4 MB document would be more useful uploaded somewhere else, with no expiry date, for other potentially interested users. Thank you so much.
