Memory leak (which leads to out of memory) when doing OCR
- June 8, 2020
Environment: Acrobat Pro DC version 2020.009.20063. Windows 10 version 1909 (18363.778). Total 16GB of physical RAM. Locale zh/cn.
I used Acrobat Pro DC to run OCR on a scanned textbook (~500 pages, 250MB). When it reached about page 90, the following three error dialogs appeared one after another, and recognition stopped:
- "unknown error"
- "unable to locate the paper capture recognition service"
- "out of memory"
(The wording may be inexact because the original dialogs are in Chinese. See the attached screenshots for the original text.)
Task Manager showed that Acrobat was using 3.5GB of RAM at that point, close to the 4GB address-space limit of a 32-bit process.
I found a workaround: recognize only 80 pages at a time, save the result, and restart Acrobat before it runs out of memory. However, this is very time-consuming, so I hope the leak will be fixed.
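For anyone repeating the workaround, the page ranges can be worked out up front. This is only a rough sketch: `page_batches` is a made-up helper name, it does not call Acrobat at all, and the batch size of 80 is just the value that happened to stay under the limit on my machine.

```python
def page_batches(total_pages, batch_size=80):
    """Yield 1-based inclusive (first, last) page ranges covering the document.

    batch_size=80 is an assumption taken from the workaround above; a
    different document or OCR setting may need a smaller value.
    """
    if total_pages < 0 or batch_size < 1:
        raise ValueError("total_pages must be >= 0 and batch_size >= 1")
    for first in range(1, total_pages + 1, batch_size):
        # The last batch is clipped to the end of the document.
        yield first, min(first + batch_size - 1, total_pages)


if __name__ == "__main__":
    # Print the manual steps for a 500-page scan.
    for first, last in page_batches(500):
        print(f"OCR pages {first}-{last}, save, then restart Acrobat")
```

For the 500-page textbook this gives seven rounds, the last one covering pages 481-500.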
I have attached sample.pdf (120 pages made by repeating the book's TOC) to reproduce this issue. Use the OCR options "Chinese (simplified)", "searchable image", and "300 dpi", and it will run out of memory at page 91.
