How to delete a page containing specific words?

Community Beginner ,
Apr 30, 2018

Hello,

I'm new to scripting and new to Adobe Acrobat DC. So far I have understood how to run code through the console.

Thanks to Karl Heinz Kremer, I found this code:

for (var p=this.numPages-1; p>=0; p--) {
    for (var n=0; n<this.getPageNumWords(p); n++) {
        var theCurrentWord = this.getPageNthWord(p, n);
        if (theCurrentWord == "9 - 10 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "MOIS") {
            this.deletePages(p);
            break;
        }
    }
}

It works very well, but it takes a very long time to run, and I have 7 to 10 more words to add to the loop, so it will take even longer.

Is there any way (in code!) to tell the script where to look? My information is always written in the same place on the page (see below). This would save time, wouldn't it?

If anyone can help, I would really appreciate it: I have around 100 files to clean up like this, and one file takes 1 hour and 40 minutes if I delete the pages manually.

Thanks,

Stef

TOPICS
Acrobat SDK and JavaScript, Windows

Most Valuable Participant ,
Apr 30, 2018

No, it would take longer. You can get the location of every word along with the text of every word, but checking the position of every word will take longer than looking at the text of each one. This is about as efficient as it can be done in JavaScript.

Community Beginner ,
Apr 30, 2018

Many thanks for your answer.

If this is not possible, would you advise me to check all the words at once with a script like this:

for (var p=this.numPages-1; p>=0; p--) {
    for (var n=0; n<this.getPageNumWords(p); n++) {
        var theCurrentWord = this.getPageNthWord(p, n);
        if (theCurrentWord == "10 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "12 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "9 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "8 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "7 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "14 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "16 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "18 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "20 ANS") {
            this.deletePages(p);
            break;
        }
    }
}

Or would you advise me to run them 2 by 2 or 3 by 3?

For information, so far I have only managed to run it once, with just 2 "searched words" as in my first script.

Looking forward to your advice!

Good day to you,

Stef

Adobe Community Professional ,
Apr 30, 2018

Info: you will never find a word like "9 - 10 ANS".

Community Beginner ,
Apr 30, 2018

Sorry, but I did run a script with this "9 - 10 ANS" "word" and it did remove my pages!

Most Valuable Participant ,
May 01, 2018

https://forums.adobe.com/people/Bernd+Alheit wrote:

Info: you will never find a word like "9 - 10 ANS".

You are totally right. But he will always find 10 ANS …

Most Valuable Participant ,
Apr 30, 2018

That looks OK. For general efficiency I'd move this.getPageNumWords(p) before the inner loop and put it in a variable. To speed things up for this use case, I'd be inclined to check first whether theCurrentWord ends in the string " ANS", because you will quickly eliminate most mismatches.

Page deletion is expensive. If you often delete blocks of consecutive pages, it might be worth combining them into a single call. The code gets complicated, though.
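
A minimal sketch of both suggestions, assuming the targets are single words (getPageNthWord returns one word at a time, so "MOIS" can match but a two-word phrase cannot). The function name deleteMatchingPages and the targets object are hypothetical, and doc stands for `this` in the Acrobat console. It caches the word count, replaces the else-if chain with one object lookup, deletes each run of consecutive pages with a single deletePages call, and always leaves at least one page in the document:

```javascript
// Hypothetical helper; `doc` is the Acrobat Doc object (`this` in the console).
function deleteMatchingPages(doc, targets) {
    var toDelete = [];  // page indices, collected from the last page to the first
    for (var p = doc.numPages - 1; p >= 0; p--) {
        var count = doc.getPageNumWords(p);  // read once per page, not once per word
        for (var n = 0; n < count; n++) {
            // One object lookup replaces the chain of else-if comparisons.
            if (targets.hasOwnProperty(doc.getPageNthWord(p, n))) {
                toDelete.push(p);
                break;
            }
        }
    }
    // A document must keep at least one page: if every page matched,
    // spare the lowest-numbered one (the last entry in the list).
    if (toDelete.length === doc.numPages) {
        toDelete.pop();
    }
    // Delete each run of consecutive pages in one call, highest pages first,
    // so the indices of the pages still to be deleted never shift.
    var i = 0;
    while (i < toDelete.length) {
        var end = toDelete[i];
        var start = end;
        while (i + 1 < toDelete.length && toDelete[i + 1] === start - 1) {
            start = toDelete[++i];
        }
        doc.deletePages(start, end);
        i++;
    }
    return toDelete.length;  // number of pages removed
}

// Assumed usage in the Acrobat console:
// deleteMatchingPages(this, { "MOIS": true });
```

Adding more single-word targets only means adding keys to the object; the loops stay the same.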

Community Beginner ,
Apr 30, 2018

It is very nice of you to help me.

This is my very first contact with coding and I am not sure I can apply your suggestions.

Is this what you mean?

for (var n=0; n<this.getPageNumWords(p); n++ {
    for (var p=this.numPages-1; p>=0; p--) {
        var theCurrentWord = this.getPageNthWord(p, n);
        if (theCurrentWord == "9 - 10 ANS") {
            this.deletePages(p);
            break;
        }
        else if (theCurrentWord == "MOIS") {
            this.deletePages(p);
            break;
        }
    }
}

Stef

Most Valuable Participant ,
Apr 30, 2018

Hi StefNegoce​,

I'm not an Acrobat scripter.

But IMHO better ways exist, like the following example. The snippet checks whether any of the requested strings are included. The alert shows you the result.

// define some test strings
var arr = [["7 ANS"], ["8 ANS"], ["9 ANS"], ["10 ANS"], ["12 ANS"], ["14 ANS"], ["16 ANS"], ["18 ANS"], ["20 ANS"], ["3 ANS"], ["15 ANS"], ["101 ANS"], ["10 anS"]];

// this regex should find 7, 8, 9, 10, 12, 14, 16, 18, 20 ANS and not find 3, 15, 101 ANS, nor 10 anS
var reg = /\b[12]?[0246-9]\sANS\b/;

for (var i=0; i<arr.length; i++) {
    // store the test result next to each test string
    arr[i][1] = " | This Regex will find = " + reg.test(arr[i][0]);
}

// alert() is the ExtendScript call; in the Acrobat console use app.alert(arr2) instead
var arr2 = arr.join("\r"); alert (arr2);

Try the snippet. (It works standalone.)

Community Beginner ,
Apr 30, 2018

Hello pixxxel schubser,

That looks interesting!

To make sure I understand correctly, here are my questions:

1/ In your line: var reg = /\b[12]?[0246-9]\sANS\b/;

   -> 12 is the number of items in my list, counting from zero (because in your example there are 13 items in the list)

   -> 9 points to the first 10 items of the list to be found.

But what is 0246 then?

2/ Instead of getting an answer, can I tell the script to delete the pages containing those selected "words"?

Anyhow, thanks for helping!

Stef

Most Valuable Participant ,
May 01, 2018

\b[12]?[0246-9]\sANS\b  means:

\b  Word boundary
[   Start of character class
1   Digit 1  or
2   Digit 2
]   End of character class
?   Zero or one occurrence
[   Start of character class
0   Digit 0  or
2   Digit 2  or
4   Digit 4  or
6   Digit 6
-   through
9   Digit 9
]   End of character class
\s  A whitespace character
A   Letter A
N   Letter N
S   Letter S
\b  Word boundary

For a better understanding:

var reg = /\b\d{1,2}\sANS\b/;

is something more general: it matches any one- or two-digit number followed by a whitespace character and ANS.
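
To see the difference between the two patterns side by side, a small comparison can help. The sample strings and the helper name comparePatterns are made up for illustration. Note that the narrower pattern is still broader than the wanted list, since it also accepts values such as 22 ANS:

```javascript
var narrow = /\b[12]?[0246-9]\sANS\b/;   // the pattern explained above
var general = /\b\d{1,2}\sANS\b/;        // any one- or two-digit number

// Hypothetical helper: reports which pattern accepts each sample string.
function comparePatterns(samples) {
    var rows = [];
    for (var i = 0; i < samples.length; i++) {
        rows.push(samples[i] + " | narrow: " + narrow.test(samples[i]) +
                  " | general: " + general.test(samples[i]));
    }
    return rows;
}

var report = comparePatterns(["7 ANS", "15 ANS", "22 ANS", "101 ANS", "10 anS"]);
// In the Acrobat console: console.println(report.join("\n"));
```

Neither pattern uses the g flag, so calling test() repeatedly on the same regex is safe here.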

Try this snippet. It should be somewhat more helpful in your case:

// nothing to find in this string
var str1 = "Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum";
var reg = /(\b[12]?[0246-9]\sANS\b)/;

if (reg.test(str1)) {
    // do something
    //alert ("Found " + reg + " = "+ reg.test(str1) + " in\n" + str1);
    alert ("Found " + reg + " = "+ RegExp.$1 + " in first string:\n" + str1);
} else {
    alert ("Nothing found in first string:\n" + str1);
}

// should find 7 ANS in the string
str1 = "Lorem ipsum Lorem ipsum 7 ANS Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum";

if (reg.test(str1)) {
    // do something
    //alert ("Found " + reg + " = "+ reg.test(str1) + " in\n" + str1);
    alert ("Found " + reg + " = "+ RegExp.$1 + " in second string:\n" + str1);
} else {
    alert ("Nothing found in second string:\n" + str1);
}

Have fun
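
Applied to the original question, the same test could run once per page instead of once per word. This is only a sketch: it assumes Acrobat reports "10" and "ANS" as separate words, so the words of a page are joined with spaces before the regex is tested. The helper names pageText and findPagesToDelete are hypothetical, and doc stands for `this` in the Acrobat console:

```javascript
// Hypothetical helper: rebuilds the text of one page from its words.
function pageText(doc, p) {
    var count = doc.getPageNumWords(p);
    var words = [];
    for (var n = 0; n < count; n++) {
        words.push(doc.getPageNthWord(p, n));
    }
    return words.join(" ");
}

// Returns the indices of the matching pages, highest first, so that deleting
// them in this order never shifts a page that is still waiting to be deleted.
function findPagesToDelete(doc, reg) {
    var hits = [];
    for (var p = doc.numPages - 1; p >= 0; p--) {
        if (reg.test(pageText(doc, p))) {
            hits.push(p);
        }
    }
    return hits;
}

// Assumed usage in the Acrobat console:
// var hits = findPagesToDelete(this, /\b[12]?[0246-9]\sANS\b|\bMOIS\b/);
// for (var i = 0; i < hits.length && this.numPages > 1; i++) {
//     this.deletePages(hits[i]);
// }
```

Pass a regex without the g flag: test() on a global regex remembers its last position and could skip matches on the next page.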

Community Beginner ,
May 01, 2018

Hi pixxxel schubser,

It is so nice of you to teach us!

I'm going to try out some code and get back to you!

Stef

Adobe Community Professional ,
May 01, 2018

First crop each page to the area of interest, then do your search. When done, uncrop each page. You should also make sure it's not possible to attempt to delete all pages, since you must have at least one.

Adobe Community Professional ,
May 01, 2018

You can crop pages using the doc.setPageBoxes JavaScript method.
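
A rough sketch of that crop, search, uncrop sequence, assuming the Doc methods getPageBox and setPageBoxes with their named parameters (cBox, nPage, nStart, nEnd, rBox). The helper name withCroppedPage and the rectangle in the usage comment are hypothetical; the rectangle must be replaced by the real area of interest and must lie inside the page. Whether the word methods then report only the words inside the crop box is the premise of the suggestion above, so it is worth confirming on one page before relying on it:

```javascript
// Hypothetical helper: crops page p to `rect`, runs `search`, then restores
// the original crop box, even if `search` throws.
function withCroppedPage(doc, p, rect, search) {
    var original = doc.getPageBox({ cBox: "Crop", nPage: p });
    doc.setPageBoxes({ cBox: "Crop", nStart: p, nEnd: p, rBox: rect });
    try {
        return search(doc, p);
    } finally {
        doc.setPageBoxes({ cBox: "Crop", nStart: p, nEnd: p, rBox: original });
    }
}

// Assumed usage in the Acrobat console. The rectangle is a placeholder,
// assumed to be [left, top, right, bottom] in points.
// var area = [0, 792, 612, 692];
// var found = withCroppedPage(this, 0, area, function (doc, p) {
//     return doc.getPageNumWords(p) > 0;
// });
```

Restoring the box in a finally block means a failed search cannot leave pages cropped.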

Community Beginner ,
May 01, 2018

Hello George_Johnson,

Thank you for this idea. I'll try it out.

Stef

Adobe Community Professional ,
May 01, 2018

For testing, it will be easier to manually crop/uncrop. If that speeds things up considerably, which it should, then you can figure out how to do it via JavaScript.
