Extracting text from PDF

New Here, Jul 18, 2018

Can specific text be extracted from a PDF file?

I have PDFs that contain pictures, tables, and plain lines of text. Each picture is identified with a g-number, and I would like to find a way to extract all the g-numbers and put them in Excel.

There is also another data set I would like to extract, but I figure if I can get one, the other should be similar.

Thanks

TOPICS
Acrobat SDK and JavaScript

New Here, Jul 23, 2018

Is there a better forum to post my question in?

LEGEND, Jul 23, 2018

That depends. Are you looking to write a JavaScript program to extract the text (which will come one word at a time)?

New Here, Jul 23, 2018

Yes, whatever the best process would be to pull out and list all the g-numbers.

LEGEND, Jul 23, 2018

OK, if you want to code in JavaScript, you'll need the Acrobat SDK. The methods to research are the Doc object's getPageNthWord and getPageNthWordQuads.
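
For anyone landing here later, a minimal sketch of the word-by-word approach might look like the following. It is not a tested production script. It assumes a g-number is the letter "g" followed by digits (for example "g12345"), which is a guess since the exact format was not given, and the names G_NUMBER and collectGNumbers are made up for illustration. It also uses getPageNumWords to get the word count for each page. If the real identifiers contain a hyphen or space, Acrobat may split them into separate words and the pattern would need adjusting.

```javascript
// Assumed g-number format: "g" followed by one or more digits (e.g. "g12345").
// This is a guess; adjust the pattern to match the real identifiers.
var G_NUMBER = /^g\d+$/i;

// Collect the unique g-numbers from a document, in order of first appearance.
// `doc` is any object exposing the Acrobat Doc members numPages,
// getPageNumWords and getPageNthWord.
function collectGNumbers(doc) {
  var seen = {};
  var found = [];
  for (var p = 0; p < doc.numPages; p++) {
    var count = doc.getPageNumWords(p);
    for (var i = 0; i < count; i++) {
      // Third argument true asks Acrobat to strip punctuation and whitespace.
      var word = doc.getPageNthWord(p, i, true);
      if (G_NUMBER.test(word) && !seen[word]) {
        seen[word] = true;
        found.push(word);
      }
    }
  }
  return found;
}

// In the Acrobat JavaScript console, `this` refers to the active document:
//   console.println(collectGNumbers(this).join("\n"));
```

Printing one g-number per line should make it easy to copy the console output and paste it into a single Excel column. The same loop could be reused for the second data set by swapping in a different pattern.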
