Hi all, I'd like to know whether there is a script or an existing method for finding pages in a document that do not contain a certain word or phrase. I have a 400-page document, and each page should contain at least one instance of a specific word or phrase supplied via a user prompt. Ideally the output would be a simple list of the page numbers that are missing this word. Thanks in advance for any assistance.
Are you a JavaScript programmer?
I'm in the process of learning JavaScript and how it can be integrated into the Action Wizard.
Ok, you may or may not find this fairly advanced. The method document.getPageNthWord is the root of all text extraction and searching in JavaScript. You would step through your pages and look at each word in turn (in a loop). You can do whatever tests you like, such as "string does not match any of the words on a page", and take the action you want. Producing output is also something of a challenge, because Acrobat places strong limits on what JavaScript can do, for security reasons.
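To make the loop described above concrete, here is a minimal sketch. It assumes a single word rather than a multi-word phrase, a case-insensitive comparison, and 1-based page numbers in the result. The function name pagesMissingWord is made up for illustration; numPages, getPageNumWords and getPageNthWord are the document methods mentioned in this thread.

```javascript
// Hypothetical helper: returns the 1-based numbers of the pages on which
// `word` never appears. `doc` is the Acrobat document object (`this` in a
// document-level script or an Execute JavaScript step).
function pagesMissingWord(doc, word) {
    var target = word.toLowerCase();
    var missing = [];
    for (var p = 0; p < doc.numPages; p++) {
        var wordCount = doc.getPageNumWords(p);
        var found = false;
        for (var n = 0; n < wordCount; n++) {
            // Third argument (bStrip) = true asks Acrobat to strip
            // punctuation and whitespace from the returned word.
            if (doc.getPageNthWord(p, n, true).toLowerCase() === target) {
                found = true;
                break; // one hit is enough for this page
            }
        }
        if (!found) {
            missing.push(p + 1); // report page numbers as the user sees them
        }
    }
    return missing;
}

// In Acrobat this could be called as, for example:
//   console.println(pagesMissingWord(this, "Invoice").join(", "));
```

Because getPageNthWord returns one word at a time, matching a phrase would mean comparing several consecutive words, which this sketch does not attempt. Writing the result to the JavaScript console with console.println sidesteps the restrictions on writing files.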
In theory, this is possible with a script, although in my experience 400 pages is pushing the limit of what a script in Acrobat can handle. The alternative is to use a stand-alone tool, which is more complicated to develop but much more robust.
Thanks for the advice and input - greatly appreciated! Sorry for the questions, but the first time I saw JavaScript was last week ;) A follow-up question: how does the code receive the variable input from the user prompt in the first step of the Adobe Action (Search & Remove Text)? Here is the start of my initial code, which will go under the Execute JavaScript step:
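// Loops over all pages, looks for a given string, and
// displays the page numbers that do not contain it
var stringToFind = ""; // <how does the variable link here from user prompt in the Action?>
var missingPages = [];
for (var p = 0; p < this.numPages; p++) {
    var found = false;
    // iterate over all words on the page until a match is found
    for (var n = 0; n < this.getPageNumWords(p) && !found; n++) {
        if (this.getPageNthWord(p, n) == stringToFind) found = true;
    }
    if (!found) missingPages.push(p + 1);
}
console.println("Pages missing the string: " + missingPages.join(", "));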
It doesn't. If you use the Search & Remove Text command then your approach needs to be completely different.
That command creates Redaction annotations over the matching terms. You then need to look for those annotations in your script, and based on their locations you can find out which pages don't contain the text you searched for.
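A rough sketch of that annotation-based approach follows. It assumes the redaction marks have been created but not yet applied, and that they are reported with the annotation type "Redact"; check this against your Acrobat version. The function name pagesWithoutRedactions is made up for illustration, while getAnnots and the annotation type property are part of the Acrobat JavaScript API.

```javascript
// Hypothetical helper: returns the 1-based numbers of the pages that carry
// no Redact annotation, i.e. pages where the search found no match.
function pagesWithoutRedactions(doc) {
    var missing = [];
    for (var p = 0; p < doc.numPages; p++) {
        // getAnnots may return null when a page has no annotations,
        // so fall back to an empty array.
        var annots = doc.getAnnots({ nPage: p }) || [];
        var hasRedact = false;
        for (var i = 0; i < annots.length; i++) {
            // Assumption: redaction marks report their type as "Redact".
            if (annots[i].type === "Redact") {
                hasRedact = true;
                break;
            }
        }
        if (!hasRedact) {
            missing.push(p + 1);
        }
    }
    return missing;
}

// In an Execute JavaScript step placed after Search & Remove Text:
//   console.println("Pages without a match: " + pagesWithoutRedactions(this).join(", "));
```

Other annotation types on a page (highlights, notes and so on) are ignored, so existing comments in the document should not affect the result.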