Hello.
I want to be able to find a sequence of ten digits in a PDF. I was thinking of reading through each character of the PDF and using an if statement with the isNaN function. However, I'm not sure that's even possible, because I don't know how to read each character in a PDF. If there is another way to do this, that would be fine, too.
Any help or guidance would be appreciated.
Thank you.
Do you want to find ANY sequence of 10 digits, or a particular sequence?
Also, are these digits joined into a single "word", or are they separated by spaces/commas/hyphens/etc.?
It's formatted like...
1234567891
And I want to get the sequence as it is in the pdf.
Use a loop to iterate over all the words in all the pages in the file (using the getPageNthWord and the getPageNumWords methods of the Document object), and then a simple Regular Expression to check if it's a 10-digit number. Something like this:
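// Label the outer loop so the inner loop can exit both at once.
loop1:
for (var p = 0; p < this.numPages; p++) {
    var numWords = this.getPageNumWords(p);
    for (var i = 0; i < numWords; i++) {
        // The third argument (true) strips punctuation and whitespace from the word.
        var word = this.getPageNthWord(p, i, true);
        // Match only words made up of exactly ten digits.
        if (/^\d{10}$/.test(word)) {
            console.println("Found it: " + word);
            break loop1; // Stop at the first match.
        }
    }
}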
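If it helps to try the regular-expression check on its own, here is a minimal sketch that runs in any plain JavaScript engine, outside Acrobat. The helper name findTenDigitNumbers and the sample word list are made up for illustration and are not part of the Acrobat API. Unlike the loop above, it collects every match rather than stopping at the first one.

```javascript
// Hypothetical helper (not part of the Acrobat API): returns every word
// in the array that consists of exactly ten digits.
function findTenDigitNumbers(words) {
    var pattern = /^\d{10}$/; // anchored, so 9- or 11-digit words do not match
    return words.filter(function (word) {
        return pattern.test(word);
    });
}

// Made-up sample standing in for the words read from a page.
var sample = ["Invoice", "1234567891", "12345", "12345678912", "0987654321"];
var matches = findTenDigitNumbers(sample);
console.log(matches.join(", "));
```

Inside Acrobat you would presumably fill the array with the words returned by getPageNthWord and print with console.println instead of console.log.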