Participating Frequently
June 18, 2018
Answered

Find a sequence of numbers in a pdf


Hello.

I want to find a sequence of ten digits in a PDF. I was thinking of reading through each character of the PDF and using an if statement with the isNaN function. However, I'm not sure that's even possible, because I don't know how to read each character in a PDF. If there is another way to do this, that would be fine too.

Any help or guidance would be appreciated.

Thank you.

Correct answer try67

It's formatted like...

1234567891

And I want to get the sequence as it is in the pdf.


Use a loop to iterate over all the words in all the pages in the file (using the getPageNthWord and the getPageNumWords methods of the Document object), and then a simple Regular Expression to check if it's a 10-digit number. Something like this:

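// Labeled outer loop so the search can stop at the first match
loop1:
for (var p=0; p<this.numPages; p++) {
    var numWords = this.getPageNumWords(p);
    for (var i=0; i<numWords; i++) {
        // true = strip punctuation and whitespace from the returned word
        var word = this.getPageNthWord(p,i,true);
        // Match only words made of exactly ten digits
        if (/^\d{10}$/.test(word)) {
            console.println("Found it: " + word);
            break loop1;
        }
    }
}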
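
If the file might contain more than one 10-digit number, a variation on the loop above could collect every match instead of stopping at the first. This is only a sketch: findTenDigitWords is a hypothetical helper name (not part of the Acrobat API), and it assumes the object passed in is the Document object, exposing the same numPages, getPageNumWords and getPageNthWord members used above.

```javascript
// Sketch only: findTenDigitWords is a hypothetical helper, not an Acrobat API.
// "doc" is assumed to be the Acrobat Document object (pass "this" from the console).
function findTenDigitWords(doc) {
    var matches = [];
    for (var p = 0; p < doc.numPages; p++) {
        var numWords = doc.getPageNumWords(p);
        for (var i = 0; i < numWords; i++) {
            // true = strip punctuation and whitespace from the returned word
            var word = doc.getPageNthWord(p, i, true);
            // Keep only words made of exactly ten digits
            if (/^\d{10}$/.test(word)) {
                // Store a 1-based page number for readability
                matches.push({ page: p + 1, index: i, word: word });
            }
        }
    }
    return matches;
}
```

In the Acrobat JavaScript console you would call findTenDigitWords(this) and print each entry's page and word with console.println. Because the pattern is anchored with ^ and $, an 11-digit word is not reported as containing a 10-digit match.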

1 reply

Inspiring
June 18, 2018

Do you want to find ANY sequence of 10 digits, or a particular sequence?

try67
Community Expert
June 18, 2018

Also, are these numbers joined to a single "word", or are they separated by spaces/commas/hyphens/etc.?

Participating Frequently
June 18, 2018

It's formatted like...

1234567891

And I want to get the sequence as it is in the pdf.