I need to be able to grab the invoice number from PDFs and add it to the filename. The customer always sends their invoices in the same format. Is there a way to get the text from the PDF and add it to the filename while resaving the document?
I am using Acrobat DC Professional.
1 Correct answer
If you know exactly where the text is, you can crop the page down to just that portion and then iterate over all the words in that area using Doc.getPageNthWord() (see the Acrobat DC SDK Documentation). That should let you extract just the text you are interested in. If you search the archives for getPageNthWord, you should find a number of examples.
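As a rough sketch of the word loop described above (it skips the cropping step and has not been run against your invoices), the helper below collects every word on a page with Doc.getPageNumWords() and Doc.getPageNthWord(). The name collectPageWords is made up for illustration. Printing the list with its indexes is one way to find which word index holds the invoice number.

```javascript
// Hypothetical helper: returns all words on one page as an array.
// doc is an Acrobat Doc object (for example `this` in the console).
function collectPageWords(doc, pageNum) {
    var words = [];
    var count = doc.getPageNumWords(pageNum);
    for (var i = 0; i < count; i++) {
        // Third argument (bStrip) is true: punctuation and white space are removed.
        words.push(doc.getPageNthWord(pageNum, i, true));
    }
    return words;
}

// Usage in the Acrobat console (left as comments so the helper stands alone):
// var words = collectPageWords(this, 0);
// for (var i = 0; i < words.length; i++) console.println(i + ": " + words[i]);
```

Because the customer's layout is always the same, the index found this way should stay the same from invoice to invoice, but that is worth checking on a few files first.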
1. I don't like the look of trying to save as "test". Even if it succeeds, the file will just be called "test" and won't automatically open in Acrobat. Try "test.pdf".
2. Are you able to save to the folder "O:\1_invoice staging" manually?
Okay,
Adding the ".pdf" extension to the code makes it work in the console window, so executing that line with a named test file works. I can't use that script line exactly as is, because the real filename is built from one variable and a user-entered value (plus ".pdf").
The error seems to me to be that the file is viewed as open - "exception in line 56 of function top level, script Batch:exec Raise error: the file may be read only ...." or the pronmbr variable is not changing with the iteration through the selected files, so it thinks it is trying to save the exact same name again - maybe??? I'm at a loss.
You can use app.alert to display the file name (or console.println to write it to the console) and see what is going on as the script runs.
Thanks, just tried this and app.alert brings up each filename correctly, I click OK and then I get the error message with no file save.
Copy the actual file name that you see in the alert (or output it to the console, and then copy it from there) into your saveAs command and run it manually from the console. Does it work?
If a file with the same name exists it will simply be overwritten. However, if that file is open, locked or is set as read-only it will fail and an error message will appear.
this.saveAs("/O/1_invoice staging/" + "105119 ART INV 8-02" + ".pdf")
does not work in console.
The target folder does not contain the original files that are being batched, or any other files. Right now I cannot get the console to repeat the test -- this.saveAs("/O/1_invoice staging/" + "test" + ".pdf") -- which worked yesterday. I am beginning to think console mode is unstable or I am not doing something right.
CORRECTION TO THE ABOVE: both examples of script ran and saved as expected. My console was locked up, closed and reopened Acrobat and now these commands work fine.
I think I am on to something:
When I run this in console it saves just fine:
var pronmbr = 105882
var date_replace = "8-01";
var filename = pronmbr + " ART INV " + date_replace + ".pdf";
console.println(filename);
app.alert(filename, 3);
this.saveAs("/O/1_invoice staging/" + filename)
When I run this in the console, app.alert shows the right filename but saveAs throws the error. (Does the getPageNthWord command hold onto the document in such a way that saveAs thinks it is protected or read-only?)
var pronmbr = getPageNthWord(0,13,false)
var date_replace = "8-01";
var filename = pronmbr + " ART INV " + date_replace + ".pdf";
console.println(filename);
app.alert(filename, 3);
this.saveAs("/O/1_invoice staging/" + filename)
Why are you specifying the third (last) parameter of getPageNthWord as false? That means it's not stripping any white-space characters from the word, which could mean you're including something like a line break in the file name, which is not allowed.
Try printing out the filename like this:
console.println(filename.toSource());
This will help you find any unwanted characters that might be hiding in it...
yes!!!
(new String("105882 \n ART INV 8-01.pdf"))
Got a pesky \n in the filename. So if I change that parameter to true, this will work?
Either that or make sure to remove any such characters from the string before using it in the file-name.
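For the second option, one possible sketch is a small cleanup function applied to the extracted word before it goes into the file name. The name cleanFilenamePart is hypothetical, and the choice to turn control characters into spaces and then collapse and trim white space is an assumption about what you want, not the only way to do it.

```javascript
// Hypothetical helper: removes line breaks and other control characters,
// collapses runs of white space, and trims both ends.
function cleanFilenamePart(text) {
    return String(text)
        .replace(/[\x00-\x1F\x7F]/g, " ") // control characters (including \n, \r, \t) become spaces
        .replace(/\s+/g, " ")             // collapse repeated white space
        .replace(/^\s+|\s+$/g, "");       // trim leading and trailing white space
}

// Usage with the variables from the script above (left as comments):
// var pronmbr = cleanFilenamePart(this.getPageNthWord(0, 13, false));
// var filename = pronmbr + " ART INV " + date_replace + ".pdf";
```

This only deals with white space and control characters. Other characters that a file system rejects would need their own handling.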
Just tested and retested this. It works perfectly now. Thank you very much for your help!!
The true/false arguments in Excel's VLOOKUP and similar functions work just the opposite way.
Well, the name of that parameter is bStrip. So if you specify it as true the white-space characters are stripped. If you specify it as false, they are retained... This is all documented in the Acrobat JS API Reference. Anyway, glad to hear you were able to sort it out!
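Putting the pieces of this thread together, a minimal sketch of the whole step might look like the following. The function name saveWithInvoiceNumber and its parameters are made up for illustration. The word index 13, the " ART INV " text and the folder path are simply the values used earlier in this thread, and page 0 is assumed.

```javascript
// Hypothetical helper: reads one word from page 0 with bStrip = true,
// builds the file name, and saves a copy. Returns the path, or null on failure.
function saveWithInvoiceNumber(doc, wordIndex, dateLabel, folder) {
    // bStrip is true, so white space such as a trailing "\n" is stripped from the word.
    var invoiceNumber = doc.getPageNthWord(0, wordIndex, true);
    var path = folder + invoiceNumber + " ART INV " + dateLabel + ".pdf";
    try {
        doc.saveAs(path);
        return path;
    } catch (e) {
        // saveAs can fail if the target is open, locked or read-only, or if the name is invalid.
        console.println("Could not save " + path + ": " + e);
        return null;
    }
}

// Usage in the console or a batch sequence (left as a comment):
// saveWithInvoiceNumber(this, 13, "8-01", "/O/1_invoice staging/");
```

Wrapping saveAs in try/catch means one bad file name is reported in the console rather than stopping the rest of the batch.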

