Participant
October 14, 2024
Question

VBA code to loop thru text (objects? labels?)


Has anyone written code to loop thru text (objects/labels? - NOT form fields) to retrieve blocks of text? Looping thru all words is a bit of a kludge, and I already have code to do that. The Acrobat API must have a way to track text objects/labels in a document. I just can't find the right object to reference.

This topic has been closed for replies.

1 reply

Thom Parker
Community Expert
October 15, 2024

From VBA, the most efficient and robust way to search raw page text is to write a JavaScript function that does the search and then call it from VBA. Neither VBA nor JavaScript provides any information about the type of text object, nor is there any way to detect graphics that appear on the page. Getting the level of detail you want requires a plug-in.
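
As a rough illustration of the JavaScript side, here is a minimal sketch that rebuilds the raw text of each page from its words and reports which pages contain a phrase. It relies only on the Doc methods getPageNumWords and getPageNthWord and the numPages property. The function names getPageText and findPagesContaining are made up for this example, and `doc` is assumed to be an Acrobat Doc object. How the function is installed and how its result is passed back to VBA is not shown here and depends on your setup.

```javascript
// Hypothetical helper: rebuild the raw text of one page.
// `doc` is assumed to be an Acrobat Doc object (e.g. `this` in a doc-level script).
function getPageText(doc, pageNum) {
  var count = doc.getPageNumWords(pageNum);
  var parts = [];
  for (var i = 0; i < count; i++) {
    // bStrip = false keeps the punctuation and white space attached to each word,
    // so joining the pieces gives the page text in content order.
    parts.push(doc.getPageNthWord(pageNum, i, false));
  }
  return parts.join("");
}

// Hypothetical helper: zero-based page numbers whose raw text contains `phrase`
// (case-insensitive). Words are compared in content order, not visual order.
function findPagesContaining(doc, phrase) {
  var needle = phrase.toLowerCase();
  var hits = [];
  for (var p = 0; p < doc.numPages; p++) {
    if (getPageText(doc, p).toLowerCase().indexOf(needle) !== -1) {
      hits.push(p);
    }
  }
  return hits;
}
```

Because the match runs against the words in the order Acrobat reports them, a phrase that looks contiguous on the page can still be missed if its words are stored out of order.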

Page text in a PDF is placed on the page using coordinates. There is no guarantee that the words appear on the page in the same order in which they are listed in the page content. Nor is there any required information about what kind of text object (heading, paragraph, table, etc.) the text is part of.
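
Since content order and visual order can differ, one possible workaround is to sort the words by their position on the page. The sketch below uses getPageNthWordQuads to get each word's bounding quad, groups words into lines by their top edge, and sorts each line left to right. The name wordsInVisualOrder and the default tolerance of 3 points are assumptions for illustration. It also assumes an unrotated, single-column page with the usual PDF coordinates (y increasing upward), so treat it as a starting point rather than a general solution.

```javascript
// Hypothetical helper: words of one page in approximate top-to-bottom,
// left-to-right order. `doc` is assumed to be an Acrobat Doc object.
function wordsInVisualOrder(doc, pageNum, lineTolerance) {
  var tol = (typeof lineTolerance === "number") ? lineTolerance : 3; // assumed default, in points
  var items = [];
  var count = doc.getPageNumWords(pageNum);
  for (var i = 0; i < count; i++) {
    var quads = doc.getPageNthWordQuads(pageNum, i);
    if (!quads || !quads.length) continue;
    var q = quads[0]; // first quad only; a word split across lines may have more
    var left = Infinity, top = -Infinity;
    // Each quad is 8 numbers (x, y pairs); take min x and max y so the
    // result does not depend on the order of the corners.
    for (var k = 0; k < q.length; k += 2) {
      left = Math.min(left, q[k]);
      top = Math.max(top, q[k + 1]);
    }
    items.push({ word: doc.getPageNthWord(pageNum, i, true), left: left, top: top });
  }
  // Highest words first, then split into lines wherever the top edge jumps.
  items.sort(function (a, b) { return b.top - a.top; });
  var lines = [];
  for (var n = 0; n < items.length; n++) {
    var last = lines[lines.length - 1];
    if (last && Math.abs(last[0].top - items[n].top) <= tol) {
      last.push(items[n]);
    } else {
      lines.push([items[n]]);
    }
  }
  // Within each line, order words left to right.
  var out = [];
  for (var m = 0; m < lines.length; m++) {
    lines[m].sort(function (a, b) { return a.left - b.left; });
    for (var w = 0; w < lines[m].length; w++) out.push(lines[m][w].word);
  }
  return out;
}
```

This only recovers a plausible reading order. It still says nothing about whether a run of words is a heading, a paragraph, or a table cell.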

That said, the text on a page can be tagged, which is a way of marking text as a particular kind of text object. Many conversion apps include tagging because the text organization is already known in the app. For example, MS Word conversion can add tags. Acrobat also has a feature for auto-tagging PDFs, although it doesn't always get things right.

In Acrobat, tags can only be searched with a plug-in.

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often

GeoGlick
Author
Participant
October 15, 2024

Thanks Thom for the info.  I think our team is going a different route (creating a PDF form).