Can specific text be extracted from a PDF file?
I have PDFs that contain pictures, text, tables, and plain lines of text. The pictures are identified with a g-number, and I would like to find a way to extract all the g-numbers and put them in Excel.
There is another data set I would like to extract as well, but I figure if I can get one, the other should be similar.
Thanks
Is there a better forum to post my question in?
That depends. Are you looking to write a JavaScript program to extract the text (which will come one word at a time)?
Yes, whatever the best process would be to pull out and list all the g-numbers.
OK, if you want to code in JavaScript, you'll need the Acrobat SDK. The methods to research are the document object's getPageNthWord and getPageNthWordQuads.
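To give a rough idea of how the word-by-word approach might look, here is a minimal sketch. It walks every page, reads each word with getPageNthWord, and keeps the ones that look like g-numbers. Several things here are assumptions rather than anything confirmed in this thread: the g-number format (taken to be a single word made of "G" followed by digits, such as G1234), and the helper names extractGNumbers, toCsv and G_NUMBER, which are made up for illustration. If your g-numbers contain hyphens or spaces, the word finder may split them into several words and the pattern would need adjusting.

```javascript
// Sketch only. ASSUMPTION: a g-number is one word, "G" followed by digits
// (for example "G1234"). Change this pattern to match your real format.
var G_NUMBER = /^G\d+$/i;

// Hypothetical helper: scans a document object that exposes the Acrobat
// JavaScript members numPages, getPageNumWords and getPageNthWord.
function extractGNumbers(doc, pattern) {
  var found = [];
  for (var p = 0; p < doc.numPages; p++) {
    var wordCount = doc.getPageNumWords(p);
    for (var i = 0; i < wordCount; i++) {
      // Third argument true asks for punctuation and whitespace to be stripped.
      var word = doc.getPageNthWord(p, i, true);
      if (pattern.test(word)) {
        // Pages are zero-based in the API; report them one-based.
        found.push({ page: p + 1, gNumber: word });
      }
    }
  }
  return found;
}

// Hypothetical helper: turns the matches into CSV text that Excel can open.
function toCsv(rows) {
  var lines = ["page,g_number"];
  rows.forEach(function (row) {
    lines.push(row.page + "," + row.gNumber);
  });
  return lines.join("\r\n");
}
```

In the Acrobat JavaScript console you could then try something like console.println(toCsv(extractGNumbers(this, G_NUMBER))); and paste the output into a .csv file. Passing the document in as a parameter, instead of using this inside the function, keeps the logic easy to try out against a stand-in object first.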