rombanks
Inspiring
February 24, 2016
Answered

Identifying blank body on a page


Hello fellows,

I wonder if it's possible to check whether a page contains text in the body area. As far as I can see, JavaScript cannot distinguish between the header, footer, and body areas. Is this true?

Thank you for your response in advance!

Correct answer by Test Screen Name

Hi guys,

Thank you for your response!

What I am trying to do is filter out those pages that contain the word "Part". The script is not supposed to run the action on these pages.

As you said, testing for the presence of the word "Part" is not a good solution, as the action is applied when other words are detected. I guess the solution is to create an array of all the words that are present on the page and check whether it contains the word "Part". Am I right?

Thanks!


You can do that, but doing an intermediate step like copying to an array is just more overhead in an already slow task.

In pseudocode,

set a flag variable to 0

if there are fewer than 42 words,

  step through each word

    if the word starts with "Part" (or whatever), set the flag to 1

Now, when the loop is finished, if flag is 1, do your action.
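
For what it's worth, the pseudocode above could be sketched in Acrobat JavaScript roughly like this. It is only a sketch: the function name and the prefix and maxWords parameters are made up for illustration, and doc is assumed to be the document object (normally this in a script). It uses getPageNumWords and getPageNthWord, and stops at the first match instead of scanning the rest of the page.

```javascript
// Sketch only. `doc` is assumed to be an Acrobat Doc object (usually `this`).
// `prefix` and `maxWords` are hypothetical parameters, e.g. "Part" and 42.
function pageHasWordStartingWith(doc, pageNum, prefix, maxWords) {
    var flag = false; // the flag variable from the pseudocode
    var numWords = doc.getPageNumWords(pageNum);

    // Only scan pages with fewer than maxWords words.
    if (numWords < maxWords) {
        for (var i = 0; i < numWords; i++) {
            // Third argument true asks Acrobat to strip punctuation and whitespace.
            var word = doc.getPageNthWord(pageNum, i, true);
            if (word && word.indexOf(prefix) === 0) {
                flag = true;
                break; // no need to look at the remaining words
            }
        }
    }
    return flag;
}
```

You would call it once per page, for example pageHasWordStartingWith(this, p, "Part", 42), and then decide in your own script whether a flagged page gets the action or is skipped.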


try67
Community Expert
February 24, 2016

It is possible, but it's not easy. You can use the getPageNthWordQuads method to get the exact location of each word on the page. Then you need to compare that location to the area you're interested in and see if they overlap. If no word overlaps this area, you can conclude that there's no text in it.
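
A rough sketch of that overlap test might look like the following. It assumes that getPageNthWordQuads returns an array of quads, each holding eight numbers (four x,y corner pairs), and that the body area is given in the same coordinate space as those quads. The function names and the area object (left, bottom, right, top) are hypothetical, and you would have to work out the actual body rectangle for your own pages.

```javascript
// Sketch only. `doc` is assumed to be an Acrobat Doc object (usually `this`).

// Bounding box of one quad, assumed to be [x1, y1, x2, y2, x3, y3, x4, y4].
function quadBounds(quad) {
    var xs = [quad[0], quad[2], quad[4], quad[6]];
    var ys = [quad[1], quad[3], quad[5], quad[7]];
    return {
        left: Math.min.apply(null, xs),
        right: Math.max.apply(null, xs),
        bottom: Math.min.apply(null, ys),
        top: Math.max.apply(null, ys)
    };
}

// Returns true as soon as any word on the page overlaps `area`.
// `area` is a hypothetical { left, bottom, right, top } rectangle.
function pageHasTextInArea(doc, pageNum, area) {
    var numWords = doc.getPageNumWords(pageNum);
    for (var i = 0; i < numWords; i++) {
        var quads = doc.getPageNthWordQuads(pageNum, i);
        for (var q = 0; q < quads.length; q++) {
            var b = quadBounds(quads[q]);
            var overlaps = b.left < area.right && b.right > area.left &&
                           b.bottom < area.top && b.top > area.bottom;
            if (overlaps) {
                return true; // one word in the area is enough
            }
        }
    }
    return false;
}
```

For instance, on a 612 x 792 point page with one-inch header and footer bands (an assumption, not something from your file), the body area would be { left: 0, bottom: 72, right: 612, top: 720 }, and pageHasTextInArea(this, 0, area) returning false would mean the body of the first page is blank.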

rombanks
Author
Inspiring
February 24, 2016

Hi try67,

Thank you for your prompt response! I checked the definition of this method and I don't see how it can be used in this case.

The method params are 0-based indices - how can they help me identify location on a page?

In addition, if you would like to test if a specific word (text string) is present on a page, how would you do that?

Thanks again!

try67
Community Expert
February 24, 2016

As I said, it's a complex task. You'll basically need to iterate over all of the words in all of the pages to be able to determine it (although you can stop as soon as a match is made, since then you know that there's text in that area).

You can search for a specific word using a similar method and the getPageNthWord method.
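
As a minimal sketch of that kind of search (the function name is made up, and doc is assumed to be the document object, normally this), you could loop over every page with numPages and getPageNumWords, read each word with getPageNthWord, and stop scanning a page at the first exact match. The comparison here is case-sensitive, which may or may not be what you want.

```javascript
// Sketch only. `doc` is assumed to be an Acrobat Doc object (usually `this`).
// Returns the 0-based numbers of all pages that contain `target` as a whole word.
function findPagesContainingWord(doc, target) {
    var matches = [];
    for (var p = 0; p < doc.numPages; p++) {
        var numWords = doc.getPageNumWords(p);
        for (var i = 0; i < numWords; i++) {
            // Third argument true asks Acrobat to strip punctuation and whitespace.
            if (doc.getPageNthWord(p, i, true) === target) {
                matches.push(p);
                break; // this page matches, move on to the next one
            }
        }
    }
    return matches;
}
```

Calling findPagesContainingWord(this, "Part") would then give you the list of pages to skip (or to process, depending on what your action is supposed to do).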