Known Participant
November 7, 2023
Question

How to get exact data which matches regular expression


In a PDF, I want to create a hyperlink on every number followed by a dot (1., 2., etc.).

Each number appears at the start of a paragraph, and its destination is the same number prefixed with #.

Note: the destination number with # can be on a different page.

 

+++++++

 

Below is the code:

// Iterate through all pages of the PDF
for (var pageNum = 0; pageNum < this.numPages; pageNum++) {
  var page = this.getPageNum(pageNum);
  var pageText = page.extractText();

  // Use a regular expression to find numbers with dots at the beginning of paragraphs
  var numberRegex = /^\d+\./gm;
  var match;

  while ((match = numberRegex.exec(pageText)) !== null) {
    // Extract the matched number with dot
    var numberWithDot = match[0];

    // Hyperlink and hyperlink destination code will go here
  }
}

 

 

++++++

Error

 

this.getPageNum(pageNum) is not a function

 

I tried an alternative, this.getNthPage(pageNum), and got:

this.getNthPage(pageNum) is not a function

How exactly do I get the content of a page and match it against a regex? Please help.

In InDesign scripting there is a doc.findGrep() function that makes this easy. How do I achieve the same in a PDF?

 

Thanks for the support.

 


2 replies

Hetal5C4C (Author)
Known Participant
November 7, 2023

Thanks @APK33396881raev.

As already mentioned in the post, it throws the error below:

this.getPageNum(pageNum) is not a function

I also tried getNthPage:

this.getNthPage(pageNum) is not a function

Is there an alternative way to get the exact content of a page that matches a regex?

I want all the content in the whole file that matches the regex, so that I can loop through it. It does not have to be page by page.

Please suggest.

Bernd Alheit
Community Expert
November 7, 2023

getPageNum and extractText don't exist in Acrobat JavaScript.

Hetal5C4C (Author)
Known Participant
November 7, 2023

Thank you @Bernd Alheit.

Yes, using it produces this error:

this.getPageNum(pageNum) is not a function

I also tried getNthPage:

this.getNthPage(pageNum) is not a function

Is there an alternative way to get the exact content of a page that matches a regex?

I want all the content in the whole file that matches the regex, so that I can loop through it. It does not have to be page by page.

Please suggest.

Bernd Alheit
Community Expert
November 7, 2023

You must loop over all the words in the file.

Read the documentation in the Acrobat JavaScript Reference.
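
For anyone landing here later, a rough sketch of what looping over all words could look like. It uses the Doc methods getPageNumWords, getPageNthWord, getPageNthWordQuads and addLink, plus Link.setAction. The helper names (collectWords, quadsToRect, linkNumbersToTargets) are made up for illustration. It also assumes that Acrobat returns tokens such as "1." and "#1" as single words when bStrip is false, and that the pages are not rotated. Neither assumption is confirmed in this thread, so check both in the console on your own file first.

```javascript
// Sketch only: helper names are hypothetical, not part of the Acrobat API.

// Walk every word on every page and keep those matching `pattern`.
// `pattern` must have one capture group (the number) and no "g" flag.
function collectWords(doc, pattern) {
  var hits = [];
  for (var p = 0; p < doc.numPages; p++) {
    var count = doc.getPageNumWords(p);
    for (var w = 0; w < count; w++) {
      // bStrip = false keeps punctuation such as "." and "#";
      // trailing white space is removed here by hand.
      var text = doc.getPageNthWord(p, w, false).replace(/\s+/g, "");
      var m = pattern.exec(text);
      if (m) {
        hits.push({ page: p, index: w, number: m[1] });
      }
    }
  }
  return hits;
}

// Bounding box [left, top, right, bottom] around a word's quads.
// Assumption: unrotated page, so quad coordinates can be used directly.
function quadsToRect(quads) {
  var xs = [], ys = [];
  for (var q = 0; q < quads.length; q++) {
    for (var i = 0; i < quads[q].length; i += 2) {
      xs.push(quads[q][i]);
      ys.push(quads[q][i + 1]);
    }
  }
  return [Math.min.apply(null, xs), Math.max.apply(null, ys),
          Math.max.apply(null, xs), Math.min.apply(null, ys)];
}

// Link each "N." word to the page holding the first "#N" word.
function linkNumbersToTargets(doc) {
  var targets = {};
  var dests = collectWords(doc, /^#(\d+)$/);
  for (var d = 0; d < dests.length; d++) {
    if (!(dests[d].number in targets)) {
      targets[dests[d].number] = dests[d].page;
    }
  }
  var sources = collectWords(doc, /^(\d+)\.$/);
  var made = 0;
  for (var s = 0; s < sources.length; s++) {
    var src = sources[s];
    if (!(src.number in targets)) continue; // no matching "#N" found
    var rect = quadsToRect(doc.getPageNthWordQuads(src.page, src.index));
    var link = doc.addLink(src.page, rect);
    link.setAction("this.pageNum = " + targets[src.number] + ";");
    made++;
  }
  return made; // number of links created
}

// In the Acrobat console: linkNumbersToTargets(this);
```

Note that the word API does not expose paragraph boundaries, so this matches any word of the form "N." and not only those at the start of a paragraph. If that gives false hits, one option is to compare the left x coordinate of the word's quads against the paragraph margin.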