Best way to OCR a large file without crashing

Community Beginner ,
Jan 08, 2019

Hello,

I am looking for the most efficient way to OCR large documents (without splitting the file into smaller parts) without Acrobat crashing or freezing.

By large, I mean a few thousand pages (the nature of my work).

I am aware of the Optimize PDF function, but am not sure whether it will help (or how to adjust its settings).

Note that I am using 600 dpi for downsampling

I am using Adobe Pro DC 2015.017.20050

Thanks

TOPICS
Scan documents and OCR, Windows


Adobe Community Professional ,
Jan 08, 2019

Hi Chris,

What I have done in those kinds of situations is to split the file into smaller documents of, say, 200 pages each, and then combine them back into one once all of them have been processed. Yes, it is tedious, but it is more likely to succeed.

I do not know of a maximum recommended page count, and I have not seen Adobe mention one, but a couple of hundred pages have been "safe" in my experience.

Let me add that you are wise and correct to scan at that resolution; OCR quality goes up considerably as the resolution increases.
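
For anyone planning the split described above, here is a minimal Python sketch that only works out the page ranges for each chunk. The function name `chunk_page_ranges` and the 3000-page example are illustrative assumptions, not anything from Acrobat; the actual splitting and recombining would still be done in Acrobat or with a PDF library of your choice.

```python
def chunk_page_ranges(total_pages, chunk_size=200):
    """Return 1-based, inclusive (first, last) page ranges covering the document.

    chunk_size defaults to 200, the figure suggested in the post above;
    it is a rule of thumb, not a documented Acrobat limit.
    """
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    ranges = []
    first = 1
    while first <= total_pages:
        # The last chunk may be shorter than chunk_size.
        last = min(first + chunk_size - 1, total_pages)
        ranges.append((first, last))
        first = last + 1
    return ranges


if __name__ == "__main__":
    # Hypothetical 3000-page file: prints the 15 ranges to OCR one at a time.
    for first, last in chunk_page_ranges(3000):
        print(f"pages {first}-{last}")
```

Keeping the ranges in a list makes it easy to check afterwards that every chunk was processed before the pieces are combined again.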

Community Beginner ,
Jan 08, 2019

Gary,

Thank you very much for the assistance. Splitting these files has been tried before. However, since each one is originally a single document, it needs to be put back together. The problem is that once the file is put back together, it either still crashes when searched or is far too big to be emailed (again, part of my work).

I was hoping there would be a way to prevent the file size from increasing to the point of crashing Adobe.

Adobe Community Professional ,
Jan 08, 2019

Oh my.

Just out of curiosity, what final size document are we talking about and how many pages for that document?

I know that PDFs can range up to some 4000 pages, so I'm wondering if there is an error somewhere within that document. Is it all text? Are there images? If so, are they bitmapped images or vector images?

Were the pages scanned, and if so, how? What kind of scanner was used? What format were they saved in (PDF, JPG, TIF, ??)?

Sorry for the questions, but need more info...

Community Beginner ,
Jan 15, 2019

No worries. I'm just thankful for the assistance

To answer your questions, the largest files being dealt with are about 3000 pages. The pages are usually scans of hard copy documents, inherently digital documents, or a combination of the two. So in cases where the files are scans only, I guess we are really dealing with a file that potentially contains 3000 images.

They were saved as PDF files

Let me know if you'd like to know anything else
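
To give a feel for what "3000 images" at 600 dpi means, here is a rough back-of-the-envelope sketch in Python. It assumes US Letter pages (8.5 x 11 in) and counts uncompressed pixel data only; the page size, the bit depths, and the function name `raw_scan_bytes` are assumptions for illustration. The PDF on disk is compressed and will be far smaller, so this says nothing definite about why Acrobat crashes.

```python
def raw_scan_bytes(pages, dpi=600, width_in=8.5, height_in=11.0, bits_per_pixel=1):
    """Estimate uncompressed pixel data, in bytes, for a scanned document.

    Assumes every page is the same size (US Letter by default) and was
    scanned at the same resolution and bit depth.
    """
    pixels_per_page = round(width_in * dpi) * round(height_in * dpi)
    return pages * pixels_per_page * bits_per_pixel // 8


if __name__ == "__main__":
    # 1 = black and white, 8 = grayscale, 24 = color
    for label, bpp in (("black and white", 1), ("grayscale", 8), ("color", 24)):
        size_gb = raw_scan_bytes(3000, bits_per_pixel=bpp) / 1e9
        print(f"3000 pages, 600 dpi, {label}: about {size_gb:.1f} GB uncompressed")
```

Under these assumptions, even black-and-white scans come to roughly 12.6 GB of raw pixels across 3000 pages, which may help explain why lower resolutions or smaller batches are often suggested for very large jobs.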

Adobe Community Professional ,
Jan 15, 2019

Hi Chris,

Your response led me to review this whole thread. I do not think the size of the document is the issue. My guess is that there is an error somewhere in the document that is causing the crash, but you did not mention whether this happens with every large document you are dealing with or with just one. That is an important distinction that should be investigated.

As far as transferring this document to others, when I have very large documents to get out, Dropbox is your friend! ;>)

Best,

Gary

Community Beginner ,
Jan 16, 2019

Hi Gary,

Apologies. The crashing occurs with every file of that estimated size, not just one or two unfortunately. Sorry for leaving that out earlier.

How does this affect our analysis of the situation?

Chris

Adobe Employee ,
Jan 16, 2019

Hi Chris,

Sorry for the inconvenience caused. Would you help me with the following details?

  1. Operating system name and version?
  2. Current version of Acrobat installed on the affected machine.
  3. A sample file that you can reproduce the issue with.
  4. Crash logs

How to get the Crash Logs:

a. When Acrobat crashes, open Windows Task Manager.

b. Go to Processes. There you will see a process named "Adobe Acrobat Pro DC".

c. Right-click this process and click "Create Dump File".

d. The dump file will be created in the user's Temp folder (the exact path is shown in the dialog that appears after the dump is created).

e. Save this DMP file to any cloud storage and share the link.

Please share the dump file and a sample file via private message (see "How Do I Send Private Message"). You can use Adobe Send for cloud storage (see "How to share a file using Adobe Document Cloud").

-Tariq Dar
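
If the dump is hard to find afterwards, a small Python sketch like the one below can pick out the most recently written `.DMP` file in a folder. It assumes the dump landed in the folder Python reports as the temp directory; the path shown in the Task Manager dialog is the authoritative one, and the helper name `newest_dump_file` is made up for this example.

```python
import tempfile
from pathlib import Path


def newest_dump_file(folder=None):
    """Return the most recently modified .dmp file in folder, or None.

    If folder is omitted, the current user's temp directory is searched.
    That default is an assumption; pass the path from the Task Manager
    dialog if it differs.
    """
    folder = Path(folder) if folder is not None else Path(tempfile.gettempdir())
    # Match the extension case-insensitively (.DMP or .dmp).
    dumps = [p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".dmp"]
    if not dumps:
        return None
    return max(dumps, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    found = newest_dump_file()
    print(found if found is not None else "No dump file found in the temp folder")
```

The returned path is the file to upload to cloud storage in step e.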

New Here ,
Aug 05, 2020

Hi there

Splitting the file is very time consuming.  There must be a better way.

Most Valuable Participant ,
Jan 16, 2019

It's possible of course that Adobe fixed it in a later version in the years since your product was made. They have been tinkering with OCR a bit, but I haven't seen a specific reference to fixing this bug.

However, Acrobat is a tool for low-volume OCR. You may be better off looking for a tool that takes high-volume work more seriously. I can't recommend any.

New Here ,
Aug 05, 2020

I have the same issue on a version of DC Pro that is just a month old.

I would very much appreciate a fix here.  Having to break down the file is not an acceptable solution for a program that costs this much.

I have this issue with a file of 500 pages, so in the grand scheme of things, not even that large really.

Can someone offer a solution please?
