golfer8866
Participant
July 18, 2018
Question

Extracting text from a PDF


Can specific text be extracted from a PDF file?

I have PDFs that contain pictures, text, tables, and plain lines of text. Each picture is identified with a g-number. I would like to find a way to extract all the g-numbers and put them in Excel.

There is also another data set I would like to extract, but I figure if I can get one, the other should be similar.

Thanks


2 replies

Legend
July 23, 2018

OK, if you want to code in JavaScript you'll need the Acrobat SDK. The methods to research are the Doc object's getPageNthWord and getPageNthWordQuads.

golfer8866
Participant
July 23, 2018

Is there a better forum to post my question in?

Legend
July 23, 2018

That depends. Are you looking to write a JavaScript program to extract the text (which will come one word at a time)?
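As a rough illustration of what "one word at a time" means in practice, here is a minimal sketch of a loop over every word on every page using getPageNumWords and getPageNthWord. The function name collectGNumbers and the G_NUMBER pattern are hypothetical; the pattern assumes a g-number is the letter "g" followed by digits (for example "g12345"), which the thread does not confirm, so adjust it to the real identifiers.

```javascript
// Assumption: a g-number is "g" followed by digits, e.g. "g12345".
// Change this pattern to match the identifiers actually used in the PDFs.
var G_NUMBER = /^g\d+$/i;

// doc is an Acrobat Doc object; inside Acrobat, pass `this`.
// Returns an array of { page, gNumber } objects (page is 1-based).
function collectGNumbers(doc) {
  var found = [];
  for (var p = 0; p < doc.numPages; p++) {
    var wordCount = doc.getPageNumWords(p);
    for (var w = 0; w < wordCount; w++) {
      // Passing true as the third argument asks Acrobat to strip
      // punctuation and whitespace around the word.
      var word = doc.getPageNthWord(p, w, true);
      if (G_NUMBER.test(word)) {
        found.push({ page: p + 1, gNumber: word });
      }
    }
  }
  return found;
}
```

In the Acrobat JavaScript console this could be called as collectGNumbers(this) and the results printed with console.println. If an identifier contains a hyphen or space, Acrobat may report it as more than one word, in which case adjacent words would need to be joined before matching.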

golfer8866
Participant
July 23, 2018

Yes, whatever process works best to pull out and list all the g-numbers.
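One possible route into Excel is a CSV file, which Excel opens directly. The sketch below only builds the CSV text. The helper name toCsv and the input shape (an array of objects with page and gNumber fields) are assumptions made for illustration, and how the text gets saved to disk is left out.

```javascript
// rows: array of { page, gNumber } objects (hypothetical shape for this sketch).
// Returns CSV text with a header row, one g-number per line.
function toCsv(rows) {
  var lines = ["page,g_number"];
  for (var i = 0; i < rows.length; i++) {
    // Quote the value and double any embedded quotes, the usual CSV convention.
    var value = String(rows[i].gNumber).replace(/"/g, '""');
    lines.push(rows[i].page + ',"' + value + '"');
  }
  return lines.join("\r\n");
}
```

For example, toCsv([{ page: 1, gNumber: "g1001" }]) yields a header line followed by the line 1,"g1001".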