Hi,
Every day I have to review several PDF documents, searching for specific words that are not allowed, mistakes, typos, etc., and add comment annotations: highlights, text boxes, insert text at cursor, or replace text.
I have found a useful tool sequence that finds specific words in the PDF and highlights them. This is already very helpful, but I want to take it a step further.
I am trying to find a script that, instead of adding a highlight annotation, adds a replace annotation and suggests an alternative to each word that is not permitted. For example:
Let's say that the word "colour" is not permitted and it should say "color" instead. In the sequence I already have, I enter "colour" in the list of words to be highlighted, and after running the sequence the PDF looks like this:
"The colour is blue"
What I want is that, instead of highlighting it, the script adds a replace annotation suggesting the word "color", which I would have entered beforehand as the replacement, every time "colour" is found. The same should work for any number of words (I have a big list).
Here is the current script I am using to find and highlight:
// NOTE: This JavaScript code is intended solely for use in an
// Action (Batch Sequence) script. It will not operate
// within a document/form field script or as a folder level script.
// Further, this script requires that the document first be marked up
// with redaction annotations.
//
// This script is the second step in a two step process for marking words in a
// PDF with the highlight annotation.
//
// Step #1: Use the Search for words to Redact tool to mark words in the
// PDF with the Redact Annotation
// Step #2: Use the following script to convert all Redact Annots into
// Highlight Annotations.
// Highlight Color
var colHilite = color.yellow;
var oDoc = event.target;
var aAnnts = oDoc.getAnnots({sortBy:"Author"});
// getAnnots returns null when the document has no annotations
for(var i=0; aAnnts && i<aAnnts.length; i++)
{
    if(aAnnts[i].type == "Redact")
    {
        aAnnts[i].type = "Highlight";
        aAnnts[i].strokeColor = colHilite;
    }
}
Here is the link of instructions:
https://acrobatusers.com/assets/uploads/actions/Find_and_Highlight_Words_and_Phrases.pdf
The find and highlight Action is a free download you'll find here:
https://acrobatusers.com/actions-exchange
Did you need different functionality? I'd be happy to modify the code for a fee. Please send me a message.
This is possible, but it's not a simple scripting task.
You would basically need to create a script that reads in a plain-text file with the list of words, then goes through the Redact annotations created by the Action, converts each one into a Replace annotation, and looks up the matching word in the data from the file to find its replacement.
I've developed similar tools in the past for my clients so if you're interested in hiring someone to do it for you feel free to contact me privately via try6767 at gmail.com and we could discuss it further.
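A minimal sketch of the lookup-and-convert step described above, written as plain JavaScript so it can be tried outside Acrobat. The function name `convertRedactsToReplacements`, the `replacements` object, and the `getWord` callback are all hypothetical. How the text under each Redact annotation is recovered inside Acrobat is left to the caller, and writing the suggestion into `contents` after switching `type` to "StrikeOut" is an assumption, not a confirmed recipe for a Replace Text markup.

```javascript
// Sketch only: works on any array of annotation-like objects.
// In an Action, the array would presumably come from event.target.getAnnots().
function convertRedactsToReplacements(annots, replacements, getWord) {
    var converted = 0;
    if (!annots) return converted;             // nothing to process
    for (var i = 0; i < annots.length; i++) {
        var annot = annots[i];
        if (annot.type !== "Redact") continue; // leave other markups alone
        // getWord is a hypothetical callback that returns the marked text
        var word = String(getWord(annot)).toLowerCase();
        if (!Object.prototype.hasOwnProperty.call(replacements, word)) continue;
        annot.type = "StrikeOut";              // assumed target type
        annot.contents = replacements[word];   // suggested replacement text
        converted++;
    }
    return converted;
}
```

Words without an entry in `replacements` are left as Redact annotations, so they can still be reviewed by hand.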
Hey, I need this code for a similar purpose. Can you help me too?
e-mail: r.r4774@gmail.com
Hi Thom Parker​
I realise my reply is well out of date, but I am in a similar situation to other people who have replied to your comments and your name seems to crop up a lot 🙂
I am in need of a script to do exactly what yours does, but in bulk, unattended. I have searched high and low and tried to do it myself, but struggled.
I have a list of PDFs and 2 supplementary text files for each PDF, containing a 'good list' and a 'bad list'.
I need to automate the process of:
Read in list of files
Open first file (file001.pdf)
Read in good list (file001G.txt)
Highlight words from good list in green
Read in bad list (file001B.txt)
Highlight words from bad list in yellow
Save file as filename + '_HL' .pdf
Loop to next file, lather, rinse, repeat....
My biggest problem is that I can't just use an Action Script in batch because each PDF requires its own unique Good and Bad list of words.
The filenames above are purely an example and could be anything uniform, i.e.:
file001.pdf (original pdf)
file001.txt (good list)
file001.log (bad list)
Many thanks....
Hello JuJu,
It is certainly possible to modify the search and highlight action to meet your parameters, as long as the good and bad file names can be reliably derived from the original file name. If you look at the "Search and Highlight" Action script you'll notice that it is large and complicated. The changes for your features are also a bit on the complicated side, so this is not a small or trivial job. Were you planning on making the updates yourself, or hiring a developer?
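Deriving the companion file names from the original file name could look roughly like the sketch below. The helper name `deriveCompanionNames` is hypothetical, and the "G.txt", "B.txt" and "_HL.pdf" suffixes simply follow the file001 example given earlier in the thread. Any other uniform naming scheme would work the same way.

```javascript
// Hypothetical helper: derives the good list, bad list and output names
// from a PDF path. Suffixes follow the file001 example in the thread.
function deriveCompanionNames(pdfPath) {
    var match = /^(.*)\.pdf$/i.exec(pdfPath);
    if (!match) return null;          // not a PDF path
    var base = match[1];              // path without the .pdf extension
    return {
        goodList: base + "G.txt",     // words to highlight in green
        badList: base + "B.txt",      // words to highlight in yellow
        output: base + "_HL.pdf"      // name for the saved, highlighted copy
    };
}
```

Keeping the derivation in one function means the naming scheme can change without touching the search and highlight logic.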
Thanks for your reply Thom.
All the file names can be made to be whatever I wish. I already have VB code which performs a series of tasks on a batch of files. I am currently using the iTextSharp library to parse and extract the textual content of the PDF. It's pretty good at this (although not quite as accurate as the Acrobat library). I tried using iTextSharp for the highlighting, but it does not put the highlights in the right location. It is especially bad with landscape-orientation PDFs.
I had hoped to be able to code the solution myself, but I am from the 'school of VB .NET' and pretty much everything I have seen regarding PDF manipulation is either JavaScript or C++, so that is looking less and less likely and I may well need to hire a developer.
JavaScript is a simple and straightforward language, much better than VB. You'll find materials for learning Acrobat JS here:
Core Language JavaScript is the same everywhere, but JS is used mostly for scripting web pages, so you can learn JavaScript from any of the HTML scripting resources online. However, since the document object model (DOM) for HTML is very different from Acrobat's, you have to be careful not to confuse the Core JavaScript with the DOM JS.
Send me a message or email if you would like to engage a developer on this.
Thanks Thom, that's very useful information.
I will roll my sleeves up and dip a toe into the JavaScript pond.
The water's not that warm at the moment 🙂
Actually, there isn't a "replace text" markup annotation. Instead, there is a "StrikeOut" annotation with some properties that cause it to be drawn as replace text. You'll also need to change the input to a CSV that contains both the search word and the replacement.
I wrote the Search and Highlight Action you are referencing and would be happy to modify it for you.
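As a rough illustration of the CSV input mentioned above, the sketch below turns rows of "search word,replacement" into a lookup object. The function name `parseReplacementCsv` and the two-column layout are assumptions, and the parser deliberately ignores quoted fields and embedded commas.

```javascript
// Sketch: turns CSV text of "search,replacement" rows into a lookup object.
// Assumes one pair per line, no quoted fields and no embedded commas.
function parseReplacementCsv(csvText) {
    var lookup = {};
    var lines = csvText.split(/\r?\n/);
    for (var i = 0; i < lines.length; i++) {
        var comma = lines[i].indexOf(",");
        if (comma < 0) continue;               // skip blank or malformed rows
        var search = lines[i].slice(0, comma).trim().toLowerCase();
        var replacement = lines[i].slice(comma + 1).trim();
        if (search && replacement) lookup[search] = replacement;
    }
    return lookup;
}
```

Search words are lower-cased here so that "Colour" and "colour" share one entry. Whether matching should be case-insensitive is a design choice the thread does not settle.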
Hey, I'm in need of that code too. Can you help?