Find a sequence of numbers in a pdf

New Here, Jun 18, 2018

Hello.

I want to be able to find a sequence of ten digits in a PDF. I was thinking of reading through each character of the PDF and using an if statement with the isNaN function. However, I'm not sure that's even possible, because I don't know how to read each character in a PDF. If there is another way to do this, that would be fine, too.

Any help or guidance would be appreciated.

Thank you.

Correct answer by try67 | Most Valuable Participant

Use a loop to iterate over all the words in all the pages in the file (using the getPageNthWord and the getPageNumWords methods of the Document object), and then a simple Regular Expression to check if it's a 10-digit number. Something like this:

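loop1:
for (var p=0; p<this.numPages; p++) {
    var numWords = this.getPageNumWords(p);
    for (var i=0; i<numWords; i++) {
        // third argument (bStrip) = true strips punctuation and whitespace from the word
        var word = this.getPageNthWord(p,i,true);
        // match words made of exactly ten digits and nothing else
        if (/^\d{10}$/.test(word)) {
            console.println("Found it: " + word);
            break loop1; // stop at the first match
        }
    }
}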
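If the file might contain more than one 10-digit number, the same loop can collect every match instead of stopping at the first one. The following is only a sketch: the function name findTenDigitWords and the shape of the returned objects are made up for illustration and are not part of the Acrobat API. It assumes the object passed in offers numPages, getPageNumWords and getPageNthWord, as the Document object does.

```javascript
// Hypothetical helper (not part of the Acrobat API): returns every word
// in the document that consists of exactly ten digits.
function findTenDigitWords(doc) {
    var matches = [];
    for (var p = 0; p < doc.numPages; p++) {
        var numWords = doc.getPageNumWords(p);
        for (var i = 0; i < numWords; i++) {
            // bStrip = true removes punctuation and whitespace from the word
            var word = doc.getPageNthWord(p, i, true);
            if (/^\d{10}$/.test(word)) {
                // page is stored 1-based so it matches what the viewer shows
                matches.push({ page: p + 1, word: word });
            }
        }
    }
    return matches;
}
```

In the Acrobat JavaScript console this would be called as findTenDigitWords(this), and the returned array could then be looped over and printed with console.println.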

TOPICS
Acrobat SDK and JavaScript

Engaged, Jun 18, 2018

Do you wish to find ANY sequence of 10 numbers, or a particular sequence?

Most Valuable Participant, Jun 18, 2018

Also, are these numbers joined into a single "word", or are they separated by spaces/commas/hyphens/etc.?

New Here, Jun 18, 2018

It's formatted like...

1234567891

And I want to get the sequence as it is in the pdf.
