
Search for missing word on a page within a document

Community Beginner, Nov 21, 2019

Hi all, I'd like to know if there is a script or an existing method for finding pages in a document that do not contain a certain word or phrase. I have a 400-page document, and each page should contain at least one instance of a specific word or phrase supplied through a user prompt. Ideally the output would be a simple list of the page numbers that are missing this word. Thanks in advance for any assistance.

TOPICS
Acrobat SDK and JavaScript

Views

312

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
1 Correct answer

LEGEND, Nov 21, 2019

Ok, you may or may not find this fairly advanced. The method document.getPageNthWord is the root of all text extraction and searching in JavaScript. You would step through your pages, and look at each word in turn (in a loop). You can do whatever tests you like, such as "string does not match any of the words on a page", and take the action you want. Making output is also something of a challenge because of strong limits on what JavaScript can do, for security reasons.
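
As a rough illustration of the loop described above, here is a minimal sketch, not a tested Action. It assumes `doc` behaves like an Acrobat Doc object (`numPages`, `getPageNumWords`, `getPageNthWord`); the helper name `findPagesMissing` is made up for this example. Because `getPageNthWord` returns one word at a time, the sketch joins each page's words so that a multi-word phrase can be matched as well.

```javascript
// Sketch: list the pages whose text does not contain a word or phrase.
// findPagesMissing is a hypothetical helper name, not part of the Acrobat API.
function findPagesMissing(doc, phrase) {
    // Normalize the search text: lower case, single spaces between words.
    var target = phrase.toLowerCase().split(/\s+/).filter(Boolean).join(" ");
    var missing = [];
    for (var p = 0; p < doc.numPages; p++) {
        var numWords = doc.getPageNumWords(p); // ask once per page, not per word
        var words = [];
        for (var n = 0; n < numWords; n++) {
            // Third argument true asks Acrobat to strip punctuation and whitespace.
            words.push(String(doc.getPageNthWord(p, n, true)).toLowerCase());
        }
        // Pad with spaces so that only whole words match.
        var pageText = " " + words.join(" ") + " ";
        if (pageText.indexOf(" " + target + " ") === -1) {
            missing.push(p + 1); // script pages are 0-based; report 1-based numbers
        }
    }
    return missing;
}
```

Inside Acrobat this could be called from the JavaScript console with something like `console.println(findPagesMissing(this, "Confidential").join(", "));`, which sidesteps the output restrictions by printing the list instead of writing a file. A search phrase that itself contains punctuation may not match, since the page words are compared after stripping.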

LEGEND, Nov 21, 2019

Are you a JavaScript programmer?

Community Beginner, Nov 21, 2019

I'm in the process of learning JavaScript and how it can be integrated into an Action Wizard action.

Community Expert, Nov 21, 2019

In theory, this is possible with a script, although in my experience 400 pages is pushing the limit of what a script in Acrobat can handle. The alternative is a stand-alone tool, which is more complicated to develop but much more robust.

Community Beginner, Nov 27, 2019

Thanks for the advice and input, greatly appreciated! Sorry for the questions, but the first time I saw JavaScript was last week ;) A follow-up question: how does the code receive the variable input from the User Prompt in the first step of the Adobe Action (Search & Remove Text)? Here is the start of my initial code, which will go under the Execute JavaScript step:

 

// Looks over all pages and find a given string and
// displays page numbers that do not have this string

var stringToFind = <how does the variable link here from user prompt in the Action?>

for (var p = 0; p < this.numPages; p++) {
     // iterate over all words
     for (var n = 0; n < this.getPageNumWords(p); n++) {
           if (this.getPageNthWord(p, n) != stringToFind)

Community Expert, Nov 27, 2019

It doesn't. If you use the Search & Remove Text command then your approach needs to be completely different.

That command creates Redaction annotations over the matching terms. Your script then needs to look for those annotations, and from their locations you can work out which pages don't contain the text you searched for.
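
To make the annotation-based idea a little more concrete, here is a minimal sketch under some assumptions: that the redaction marks left by Search & Remove Text show up through `getAnnots()` with the annotation type `"Redact"`, that the marks have not yet been applied, and that the document has no other redaction marks. The helper name `findPagesWithoutRedactions` is made up for this example.

```javascript
// Sketch: report pages that carry no redaction annotation.
// findPagesWithoutRedactions is a hypothetical helper name.
// Assumes `doc` behaves like an Acrobat Doc object and that search hits
// were marked as annotations of type "Redact" (an assumption to verify).
function findPagesWithoutRedactions(doc) {
    // Make sure Acrobat has scanned every page for annotations, if available.
    if (typeof doc.syncAnnotScan === "function") {
        doc.syncAnnotScan();
    }
    var hasRedaction = {};
    var annots = doc.getAnnots(); // null when the document has no annotations
    if (annots) {
        for (var i = 0; i < annots.length; i++) {
            if (annots[i].type === "Redact") {
                hasRedaction[annots[i].page] = true; // page is 0-based
            }
        }
    }
    var missing = [];
    for (var p = 0; p < doc.numPages; p++) {
        if (!hasRedaction[p]) {
            missing.push(p + 1); // report 1-based page numbers
        }
    }
    return missing;
}
```

In an Execute JavaScript step placed after Search & Remove Text, `findPagesWithoutRedactions(this)` would give the list of pages with no hit. If the script does its own searching instead, `app.response` is one way to ask the user for the text, though inside an Action the prompt would likely appear once per processed file.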
