Acrobat JavaScript: technique for extracting footnotes
Hi,
I'm investigating how to extract footnotes from PDFs that were converted from Word files. One approach I'd like to validate is a search that looks for footnotes in the text (understanding that "text" is not a straightforward concept in a PDF). For scripting options, I've been looking at:
- Adobe PDF Library SDK
- Acrobat DC SDK
I'm wondering whether I could first search for a number (e.g. 1) and then decide whether it is a footnote reference, either from the size and relative offset of its bounding rectangle or, if text properties are available, from a superscript property (if there is one). If it is a footnote reference, the script would then find the actual footnote content at the bottom of the page and extract it.
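To make the rectangle idea concrete, here is a rough sketch of the first step only (finding candidate footnote numbers by their size). It relies on the Acrobat JavaScript Doc methods getPageNumWords, getPageNthWord and getPageNthWordQuads; the function names wordBox and findFootnoteCandidates and the 0.8 height ratio are my own placeholders, not anything from the SDK. It also assumes that a superscript number comes back as its own word with a shorter bounding box than the body text, which may not hold if Acrobat merges the number into the preceding word.

```javascript
// Bounding box of a word from its quads.
// Each quad is an array of 8 numbers: four (x, y) corner pairs.
function wordBox(quads) {
  var xs = [], ys = [];
  for (var q = 0; q < quads.length; q++) {
    for (var i = 0; i < 8; i += 2) {
      xs.push(quads[q][i]);
      ys.push(quads[q][i + 1]);
    }
  }
  return {
    left: Math.min.apply(null, xs),
    bottom: Math.min.apply(null, ys),
    top: Math.max.apply(null, ys)
  };
}

// Hypothetical helper: returns the words on a page that are all digits and
// noticeably shorter than the median word height on that page.
// `ratio` is an assumed threshold (default 0.8), to be tuned per document.
function findFootnoteCandidates(doc, pageNum, ratio) {
  ratio = ratio || 0.8;
  var count = doc.getPageNumWords(pageNum);
  var words = [];
  for (var i = 0; i < count; i++) {
    var box = wordBox(doc.getPageNthWordQuads(pageNum, i));
    words.push({
      index: i,
      text: doc.getPageNthWord(pageNum, i, true), // true = strip punctuation
      height: box.top - box.bottom,
      bottom: box.bottom // lower values are closer to the page bottom
    });
  }
  // Median word height stands in for the body text size.
  var heights = words.map(function (w) { return w.height; });
  heights.sort(function (a, b) { return a - b; });
  var median = heights[Math.floor(heights.length / 2)];

  return words.filter(function (w) {
    return /^\d+$/.test(w.text) && w.height < ratio * median;
  });
}
```

In the Acrobat JavaScript console this would be called as findFootnoteCandidates(this, 0) for the first page of the open document. Each result keeps the word index and its bottom coordinate, so a second pass could treat the lowest match for a given number as the footnote marker at the bottom of the page and read the words that follow it.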
Thanks
