Acrobat 9 crashes on OCR

October 7, 2008

I've been trying to convert a batch of large PDF files to searchable PDFs using Acrobat's OCR. In the middle of the batch, a large (1000+ page) document crashes Acrobat. I have narrowed it down to this image:

http://img90.imageshack.us/img90/2418/badke2.png
59,520 bytes

When I convert it to PDF (File->Create PDF->From Single File) and then run "Document->OCR Text Recognition->Recognize Text Using OCR", Acrobat always crashes.

Can anyone else try it and confirm whether it crashes for them too?

It kills my batch processing and is making this large conversion quite painful. Is there a way around it?


GustavoSainz

Participant

Acrobat 9.5.0.270 crashes on some PDFs for me during batch processing with Fast Web View, OCR, and PDF Optimizer enabled (monochrome images set to JBIG2 lossless).

Faulting application name: Acrobat.exe, version: 9.5.0.270, time stamp: 0x4f03f71d

Faulting module name: OCRLibraryInf.dll, version: 9.5.0.270, time stamp: 0x4f03e982

Exception code: 0xc0000094

Fault offset: 0x00093aeb

Faulting process id: 0x14fc

Faulting application start time: 0x01cd02e462799896

Faulting application path: C:\Program Files (x86)\Adobe\Acrobat 9.0\Acrobat\Acrobat.exe

Faulting module path: C:\Program Files (x86)\Adobe\Acrobat 9.0\Acrobat\plug_ins\PaperCapture\OCRLibraryInf.dll

Report Id: 0dbea415-6f62-11e1-a8e9-005056ba0009
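
The exception code in the report above can be looked up mechanically. A minimal sketch, assuming the report is pasted as plain "Key: value" lines as shown; the function names are hypothetical, and the table lists only a few well-known NTSTATUS values:

```python
# A few well-known Windows NTSTATUS exception codes (not an exhaustive list).
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def parse_fault_report(text):
    """Split 'Key: value' lines of a crash report into a dict (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # ignore blank lines and lines without a 'Key: value' shape
            fields[key.strip()] = value.strip()
    return fields


def describe_exception(fields):
    """Return the symbolic name for the report's exception code, if known."""
    code = int(fields["Exception code"], 16)
    return NTSTATUS_NAMES.get(code, f"unknown (0x{code:08X})")
```

For the report above this gives STATUS_INTEGER_DIVIDE_BY_ZERO: the fault was an integer division by zero inside OCRLibraryInf.dll. It does not say what in the page triggered it.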

Anonymous

MacBook Pro (1997)

- Mac OS X 10.7.2

- 2.6GHz Core 2 Duo

- 4GB RAM

Acrobat 9 Pro

- version 9.4.6

Acrobat 9 Pro OCR always crashes when using ClearScan but not when using "Searchable Image" or "Searchable Image (Exact)." I tested several journal pages scanned at 300 dpi (color, grayscale, bitmap) as .tiff and .png, as well as text captured from a browser with a screen selection. The results were consistent across all variations.

The last time I used Acrobat's OCR function was last summer, before upgrading from Snow Leopard to Lion. Under Snow Leopard, Acrobat did not crash during OCR (it did crash at other times, just not while processing text for OCR). I did not attempt Acrobat OCR under Lion 10.7 or 10.7.1.

Repeatable test.

1. Open the Wikipedia "Crash (computing)" page

http://en.wikipedia.org/wiki/Crash_(computing)

2. Enlarge text size, if desired.

I tried several text sizes from the default to much, much larger. Text size has no impact on the results.

3. Create a PDF

File >> Create PDF >> From Selection Capture

I selected the first paragraph:

A crash (or system crash) in computing is a condition where a computer or a program, either an application or part of the operating system, ceases to function properly, often exiting after encountering errors. Often the offending program may appear to freeze or hang until a crash reporting service documents details of the crash. If the program is a critical part of the operating system kernel, the entire computer may crash. This is different from a hang or freeze where the application or OS continues to run without obvious response to input.

4. OCR

Document >> OCR Text Recognition >> Recognize Text Using OCR

4.1 Searchable Image (Exact)

Primary OCR Language: English (US)

PDF Output Style: Searchable Image (Exact)

Downsample: None

Result: No crash — OCR successful

4.2 Searchable Image (tested for each downsample option)

Primary OCR Language: English (US)

PDF Output Style: Searchable Image

- Downsample: Lowest (600 dpi)

- Downsample: Low (300 dpi)

- Downsample: Medium (150 dpi)

- Downsample: High (72 dpi)

Result: No crash — OCR successful

4.3 ClearScan (tested for each downsample option)

Primary OCR Language: English (US)

PDF Output Style: ClearScan

- Downsample: Lowest (600 dpi)

- Downsample: Low (300 dpi)

- Downsample: Medium (150 dpi)

- Downsample: High (72 dpi)

Result: Crash — OCR not successful

failed_spirit

Participant

I'm having the same problem. ClearScan crashes, but "searchable image" does not. Happily, I can export text from "searchable image" format.

MacBook Pro (2008) 2.4 GHz Core 2 Duo (surely Kelly means 2007, not 1997)

4 GB RAM

OSX 10.7.2 (Lion)

Acrobat version 9.4.6

Anonymous

failed_spirit wrote: "surely Kelly means 2007, not 1997"

You are correct. 2007.

Anonymous

I've got 9.4, and I too am having the crash-to-desktop problem when trying to OCR pages, usually with 50 or more pages, and at random. Very frustrating given the time involved in scanning, waiting for the program to OCR, and then seeing that dreaded error screen with no apparent explanation. Not every time though, praise the Lord. So far I have not seen anyone answer this problem decisively in this thread. Does anyone know a decisive answer? Thanks, Bill W.

Bill12

Inspiring

I have not seen the problem with AA 9.4. What you might try is running OCR on a portion of the document, saving that result, and then trying the next part of the document. The key may be the number of pages. Acrobat used to have a 50-page limit, but that is gone. Still, it might be a size issue. You could also try clearing your TEMP folder and see if that helps.

With all such experiments, do work on a copy.
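
Before clearing the TEMP folder, it can help to see how much is in it. A rough sketch using only the Python standard library; that Acrobat keeps its OCR scratch files in the system temp directory is an assumption here, and `dir_size_bytes` is a hypothetical helper name:

```python
import os
import tempfile


def dir_size_bytes(path):
    """Sum the sizes of all files under `path`, skipping unreadable ones."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file disappeared or cannot be read; skip it
    return total


# Example use (assumes the system temp directory is the one that matters):
#   temp_dir = tempfile.gettempdir()
#   print(temp_dir, dir_size_bytes(temp_dir) / 1e6, "MB")
```

This only reports the size; it deletes nothing, so it is safe to run before deciding what to clear.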

Anonymous

Thanks very much Bill. I’ll give these ideas a try. Bill W

Anonymous

I am having this problem also, running ver. 9.4

oxident

Participant

Adobe released the Acrobat 9.1 update yesterday. Unfortunately, the online updater doesn't find it so you have to download it manually from Adobe's website.

As far as I can tell the OCR engine is much more stable now! No more crashes so far...

Gerry_Hein

Participant

Version 3 was awesome, simple and stable. Came with an office scanner.
Version 5.05 was great, everything in one package, before Acrobat split standard and pro.
Version 6.X was a nightmare with bloated PDFs and snail's-pace performance. Luckily I just had to help others who had it.
Version 7.1 Pro was/is easy and stable.
Version 8.x I have no personal experience with.
Version 9 Pro Extended trial has some really nice features like one-step watermark removal, but I have not bought the upgrade yet. It's good to hear other people's trials and tribulations, even though you hardly ever see positive posts thanking Adobe for a great product.

oxident

Participant

So I guess the latest stable version for doing OCR is Acrobat 7, isn't it?

oxident

Participant

I'm also having problems with Acrobat crashing randomly when OCRing large scanned documents (>100MB). These problems began with Acrobat 8 and are still there in AA9!

What kind of fix do you mean Adobe has applied to AA9? Since installing the boxed version, Adobe's Update application never found any updates for AA9 :-(

_Terry_Smythe_

Participant

Thank you for your suggestion. I did try that, but discovered that AA8 would sometimes crash on an earlier page, forcing a restart from the beginning. I concluded that, as AA8 does everything in memory, these 600 MB PDF files were simply too big. And even on a swift 3.0 GHz dual-core system with 3.5 GB of memory, it still took a huge amount of time.

Curiously, if I took note of the page where it crashed and then sent just the 10 pages surrounding it for processing, AA8 would OCR them quite normally.

As a consequence, rather than trust it to crash at the same spot, I elected to break the files in half, then OCR 100 pages at a time, saving the file at the conclusion of each group. It might still crash occasionally, but at least I did not have to repeat what had already been done successfully.

Knitting the broken files together after successful OCR processing is really quite trivial, done in seconds, not a hardship.
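
The 100-pages-at-a-time plan can be worked out ahead of time. A minimal sketch that only computes the page ranges to enter in the OCR dialog; it does not drive Acrobat, and `chunk_ranges` is a hypothetical helper name:

```python
def chunk_ranges(total_pages, chunk_size=100):
    """Return 1-based, inclusive (first, last) page ranges covering the document."""
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be positive")
    return [
        (first, min(first + chunk_size - 1, total_pages))
        for first in range(1, total_pages + 1, chunk_size)
    ]


# Example: a 250-page file in groups of 100.
for first, last in chunk_ranges(250, 100):
    print(f"OCR pages {first}-{last}, then save")
```

Saving after each range means a crash costs at most one group of pages.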

But how nice it would be if the fix applied to Version 9 would also be applied to version 8.1.3. As a volunteer, I can't afford version 9, and my version 8.1.3 otherwise does the job, albeit with aggravation.

When the next big group of similar PDF files emerge, I won't waste so much time experimenting. I'll just repeat this process from the beginning.

Regards, and thank you for thinking of me, appreciated.

Terry Smythe

_vjw_

Participant

Rather than breaking the PDF into smaller files, run OCR until it crashes, note the page, then run OCR up to the page before the crash. Then run OCR again starting at the page AFTER the crash.

At least you'll have the PDF doc in one piece, even though certain pages in the doc won't have been OCR'd.

Then you can insert re-scanned pages into the appropriate spot.

This was a workaround that worked for me.
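
The skip-the-bad-page approach can be sketched the same way. This hypothetical helper only computes which page ranges to OCR once the crashing pages are known; the page numbers in the example are made up:

```python
def ranges_skipping(total_pages, bad_pages):
    """Return 1-based, inclusive (first, last) ranges that avoid every bad page."""
    ranges = []
    start = 1
    for bad in sorted(set(bad_pages)):
        if not 1 <= bad <= total_pages:
            raise ValueError(f"page {bad} is outside 1..{total_pages}")
        if bad > start:
            ranges.append((start, bad - 1))  # pages before this bad page
        start = bad + 1  # resume on the page after the crash
    if start <= total_pages:
        ranges.append((start, total_pages))
    return ranges


# Example: a 1000-page document that crashed on pages 412 and 780.
print(ranges_skipping(1000, [412, 780]))
```

The skipped pages stay un-OCR'd until they are re-scanned and inserted.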
