Extract text with VB.NET

October 19, 2017

Hello. I have tried many approaches on my own to extract text directly from a PDF, but I keep running into the following issue.

I have a large PDF file that combines many forms, totaling 900 pages. I split that file every 5 pages to create 180 smaller files, and that works fine. The challenge is that, to name each file once it is extracted, I need the last name, first name, and middle initial from page 1. Unfortunately, the code I have works very well for the first file but not for any subsequent file. The (x, y) coordinates of the fields in question seem to vary from one file to the next. I can see no pattern in the coordinates, so I cannot even write an algorithm to predict them.
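
For reference, here is a minimal sketch of the splitting arithmetic and file naming described above. It is written in Python only to keep it free of any PDF library, and the same logic ports directly to VB.NET. The function names, the `Last_First_M.pdf` naming pattern, and the sample name are assumptions made for illustration, not the code actually in use.

```python
import re


def page_ranges(total_pages, pages_per_file):
    """Return 1-based inclusive (first, last) page ranges, one per output file."""
    return [
        (start, min(start + pages_per_file - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_file)
    ]


def build_filename(last, first, middle_initial):
    """Build 'Last_First_M.pdf' (hypothetical pattern), dropping unsafe characters."""
    parts = [last, first, middle_initial]
    # Keep letters, digits, apostrophes and hyphens; drop everything else.
    cleaned = [re.sub(r"[^A-Za-z0-9'-]", "", p.strip()) for p in parts]
    # Skip empty parts, e.g. when there is no middle initial.
    return "_".join(p for p in cleaned if p) + ".pdf"


ranges = page_ranges(900, 5)
print(len(ranges))              # 180
print(ranges[0], ranges[-1])    # (1, 5) (896, 900)
print(build_filename("Smith", "Jane", "Q."))  # Smith_Jane_Q.pdf
```

The first page of each range is the one the name has to be read from, so any extraction step would run once per range before the file is saved.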

Is there a way to strip all formatting, images, and other content from a PDF so that I can guarantee my text lookup fetches the right value from a consistent (x, y) coordinate?
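
One possible alternative to fixed coordinates, sketched here under several assumptions: that the PDF library in use can return text fragments together with their positions, and that each form prints a label next to the value. The idea is to find the label first and then take the nearest fragment to its right on the same line, so absolute positions no longer matter. The `(x, y, text)` tuple layout, the label strings, the tolerance value, and the sample data are all hypothetical. The logic is shown in Python and is independent of any particular PDF toolkit.

```python
def value_right_of(fragments, label, y_tolerance=3.0):
    """Return the text of the nearest fragment to the right of `label`.

    fragments: list of (x, y, text) tuples for one page (assumed layout).
    Returns None when the label or a value on its line cannot be found.
    """
    wanted = label.strip().lower()
    anchors = [f for f in fragments if f[2].strip().lower() == wanted]
    if not anchors:
        return None
    ax, ay, _ = anchors[0]
    # Candidates sit on roughly the same baseline and to the right of the label.
    candidates = [
        (x, text)
        for x, y, text in fragments
        if abs(y - ay) <= y_tolerance and x > ax
    ]
    if not candidates:
        return None
    # Pick the candidate with the smallest x, i.e. the closest one.
    return min(candidates, key=lambda c: c[0])[1].strip()


# Two hypothetical pages whose fields sit at different coordinates.
page_a = [(72, 700, "Last Name:"), (150, 700, "Smith"),
          (72, 680, "First Name:"), (150, 680, "Jane")]
page_b = [(90, 655.5, "Last Name:"), (171, 654.0, "Nguyen"),
          (90, 630, "First Name:"), (171, 631, "Tom")]

print(value_right_of(page_a, "Last Name:"))   # Smith
print(value_right_of(page_b, "Last Name:"))   # Nguyen
```

Whether this applies depends on the forms actually containing such labels as extractable text; if the labels are part of a scanned image, it would not work as written.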
