nhatquy
Participant
December 13, 2017
Question

Collecting data from some scanned files


Hi everybody,

I have some scanned files that all use the same form.

Now I want to collect some information from those files.

How can I do that?

Thanks, all.

This topic has been closed for replies.

3 replies

nhatquy
Author
Participant
December 14, 2017

I can already convert those files with OCR, or to Excel, Word, and so on. That part is not the problem.

What I mean is:

I have 1, 2, 3, ... n scanned files that all use the same form, and I want to collect some information from each of them into one Excel table.

See the example images I attached:

try67
Community Expert
December 14, 2017

Once you convert the files to another format it's no longer related to PDF and not the topic of this forum.

If you run OCR on your PDF files and you have Adobe Acrobat Pro then it might be possible to do it in Acrobat.

If you wish you can send me some sample files (to: try6767 at gmail.com) and I'll let you know if I think it's doable or not.

Thom Parker
Community Expert
December 14, 2017

Once the scanned file is OCR'd you can use the text selection tool to copy and paste text.

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
nhatquy
Author
Participant
December 18, 2017

No.

With n files, conversion gives me n OCR'd files, and copying and pasting from each one takes far too much time.

Thom Parker
Community Expert
December 18, 2017

You can automate data extraction with Acrobat JavaScript, but this is an advanced programming task. There are also several 3rd party tools for this.

If the data position is well defined, then it is not a particularly difficult task in JavaScript. But if the data position cannot be predicted, it may be impossible to automate.
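As a rough illustration of the well-defined-position case, here is a minimal sketch, not a finished script. It assumes every field of the form always falls inside a fixed rectangle on the page. The function names (`centerInside`, `extractFields`, `toCsvLine`, `wordsFromAcrobatPage`) and any region names or coordinates are hypothetical and made up for this example. Only `getPageNumWords`, `getPageNthWord`, and `getPageNthWordQuads` are Acrobat JavaScript Doc methods, and they only return text after Text Recognition has been run on the scan.

```javascript
// Sketch under stated assumptions: each field sits in a fixed rectangle.
// Boxes and rectangles are [left, bottom, right, top] in PDF points.

// True when the centre of a word's bounding box lies inside rect.
function centerInside(box, rect) {
  var cx = (box[0] + box[2]) / 2;
  var cy = (box[1] + box[3]) / 2;
  return cx >= rect[0] && cx <= rect[2] && cy >= rect[1] && cy <= rect[3];
}

// words:   [{ text: "...", box: [l, b, r, t] }, ...] in reading order
// regions: { fieldName: [l, b, r, t], ... }  (hypothetical field names)
// Returns one object per file, e.g. { number: "12345", name: "John Smith" }.
function extractFields(words, regions) {
  var row = {};
  Object.keys(regions).forEach(function (name) {
    row[name] = words
      .filter(function (w) { return centerInside(w.box, regions[name]); })
      .map(function (w) { return w.text; })
      .join(" ");
  });
  return row;
}

// Quote each value so the line can be opened in Excel as CSV.
function toCsvLine(values) {
  return values
    .map(function (v) { return '"' + String(v).replace(/"/g, '""') + '"'; })
    .join(",");
}

// Acrobat-only helper: collects the words of one page with their boxes.
// doc is an Acrobat Doc object (for example `this` in a document script).
function wordsFromAcrobatPage(doc, pageNum) {
  var words = [];
  var count = doc.getPageNumWords(pageNum);
  for (var i = 0; i < count; i++) {
    var quads = doc.getPageNthWordQuads(pageNum, i);
    if (!quads || quads.length === 0) continue; // no position available
    var q = quads[0]; // eight numbers: four x,y corner pairs
    var xs = [q[0], q[2], q[4], q[6]];
    var ys = [q[1], q[3], q[5], q[7]];
    words.push({
      text: doc.getPageNthWord(pageNum, i, true),
      box: [
        Math.min.apply(null, xs), Math.min.apply(null, ys),
        Math.max.apply(null, xs), Math.max.apply(null, ys)
      ]
    });
  }
  return words;
}
```

Inside Acrobat, the per-file call might look like `extractFields(wordsFromAcrobatPage(this, 0), regions)`, with `regions` measured once from a sample file. Running that over a folder of files (for instance from an Action in Acrobat Pro) and writing one `toCsvLine` result per file would give a table that Excel can open. If the fields move around from scan to scan, fixed rectangles like these will not work.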

Contact me if you'd like a custom script written.

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
try67
Community Expert
December 13, 2017

The first step is to run Text Recognition on them. Until you do that there's no information to collect, as they are just images.