Known Participant
March 2, 2020
Answered

Unable to recognize text in document and all text copied is unreadable characters

I have a document that is an original PDF, i.e. not scanned. The RECOGNIZE TEXT function runs, and takes its sweet time doing so, but when it finishes the text is still not recognized.

Is there anything I can do here other than print the document, scan the print job, and OCR the resulting scan?

If I attempt to copy and paste any text from the document, I get a result like this:

 

ÂÜâ÷ØÃêöØãÛÚÂÄãääêãÅéäêãÆäâêÇÂâãÈ÷ÂÉãÂÇÃÄÊØËÅ›ÌÍÆÎàÇÈÉÅ~ÏÉñÇ‹ÑÊòËÒÂñÌÍÍóÓÎÛÔãÂØôÍôËØÉÂôÄÄèÕÍØÖÎöÔÛãÕÃ.Å~äÂÜØËÂÙääõêßåâö÷ÙÉéæâøØàäøêÍéâÍöíÎØÂèÛÍÚéÖÛãÂÌÂáêfiÂ÷÷ÂöãôÃÂÙØäÛÂôÂõÜäöÚã÷ØöâãøÂØ÷Âøãá÷éÂãíöÂÃãØÃØÛÂØÂÛÂéØÑãÛÂØ÷ÍÂé÷‹ÖãáÂÂßöãÖØÃàçØÉéÂá›äöÑØàÒ.Å~ÔØÎñêòÍãÂÛÖÂÌË ëññóóÑÛÛãã娨ÑôôÛíÕÂíãÌÂÃÍÅêÉØÆÂâ›ÍfiÂËØàÑÙÅ~.Å~ÔñÌÅßòÑÆÂÖñÇÉÚÄóÛÛÂÕãÇØÖßôàÕÖàÂéßÍÖÎéàÚÍÉÙìÖäãÌíóÉíÇßÂêÒÊÍËÛ‹ÌÛÍËéÎfiÂöÂÚÂØÃêÙäâû...Å~.......ÅÄ..âä÷ÚöÛêÚíôÂÂ÷ÚíáêÂÂÄøãÃÂ›ÅØÂÆõ÷Çöö÷ãÉâáâÚØÑâøÒØéÔÂöàÎØÙÛÍØÂÖâöØÌãËöÜ÷íÙÂêãÚÃ........Å~.......âÂø÷äÛÂáÛãÂöéãÚÃÜãØØä÷ÂöÛêØÂøØÛí êääãê÷fiÂÂããÅØÃáÂøØÚÂäÂ÷ö›ÂêØÂêØãôÃãäØöèÂØâØã÷óöÂåíèÂØÚØöêÙÂäóÜØÂÚåÚâØêØÛøfiÂÂÂãÚàÃÙÚÚÙÂãâÜÂÃãöÃÂØÄØåöäÍØ÷ÎÂäéÛãÛÂÙÚÍóÂâÂÂÜä÷ÛéêÂèèáÙÙ÷ääâéÛÃãêÂØèøøØÂèÂãõîóöÂØã÷ÃØâØêØÂÂøõãÃéöØ÷öØÂâöÛØØÂøçäêééÂöäÜöØÙØÛÚôÂâÚØêØÂêøéãÂÜÛÚÂÂêãÛ÷óØÂÂããÚÂÃáô÷ØöÂØØãêÃáøáÂäØôêâÂãØãäÃêåØãØÛÂÂÂøôÚãÚØøÂØÂã÷ ..›õºöà÷âõÅ~..öøûéºö›Ø».Å~.ÁÚõóêÂñøºÂ÷ãÃ÷ÙøØ.äÂöêÁØâºçÛéí.ÂäÅ.öêØâô÷fiÂØØöÙêÄã.ÛÁÂÅ÷ºÆáÂ.ÇÚ}êëóìøÂãÆöÁÚÏ÷øÉäò.ê.âÑÂÅ~ÜÓóÚßÖöÀãàê¿ËØÚöÛùÂÂ÷ÚöãÂÂöïÒéѺöÎäÛ»åø.Å~äßâÕãÖä÷ÒêÍíóÂãÃÎØÂÍ‹ËØßÖôÒèôÑØöàÂÛfiÃÂÂ÷éÙøÂâ÷êãÚâãÂ
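
For anyone who needs to check many files for this symptom, here is a minimal sketch of a heuristic that flags copied text like the sample above. It assumes the source document is in English, so a high share of non-ASCII letters signals a broken text layer. The function name `looks_garbled` and the 0.3 threshold are illustrative assumptions, not anything from Acrobat, and the check would misfire on text that is legitimately written in another language.

```python
def looks_garbled(text: str, threshold: float = 0.3) -> bool:
    """Return True when a large share of the letters are non-ASCII.

    Hypothetical heuristic: assumes the document is expected to be English.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        # Nothing to judge (empty string, digits, or punctuation only).
        return False
    non_ascii = sum(1 for ch in letters if ord(ch) > 127)
    return non_ascii / len(letters) >= threshold


# A short prefix of the pasted sample, and an ordinary English sentence.
sample = "ÂÜâ÷ØÃêöØãÛÚÂÄãääêãÅéäêã"
print(looks_garbled(sample))                                    # True
print(looks_garbled("The RECOGNIZE TEXT function will run."))   # False
```

A page that trips this check is a candidate for the rasterize-and-OCR route rather than for direct text extraction.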

 

Correct answer Bernd Alheit

Export the pages as TIFF files

Combine the TIFF files as a new PDF file

Run OCR on this new file
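
The three steps above are done in Acrobat. The same idea (turn each page into an image so the broken text layer is discarded, then OCR the images) can also be sketched outside Acrobat. The version below is only an illustration under stated assumptions: it swaps in Poppler's `pdftoppm` and the Tesseract command-line tool, neither of which is mentioned in this thread, and it assumes both are installed and on the PATH. The function names and paths are hypothetical.

```python
import subprocess
from pathlib import Path


def rasterize_cmd(pdf, prefix, dpi=300):
    """Build the pdftoppm command that renders every page to a TIFF image."""
    return ["pdftoppm", "-r", str(dpi), "-tiff", str(pdf), str(prefix)]


def ocr_cmd(image, out_base):
    """Build the tesseract command that writes a searchable PDF to out_base.pdf."""
    return ["tesseract", str(image), str(out_base), "pdf"]


def rasterize_and_ocr(pdf, workdir):
    """Render each page to TIFF, then OCR each TIFF into its own PDF.

    Assumption: pdftoppm and tesseract are installed; this is not Acrobat's
    own workflow, only an analogous one.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(rasterize_cmd(pdf, workdir / "page"), check=True)

    outputs = []
    # pdftoppm zero-pads page numbers, so a plain sort keeps page order.
    for tif in sorted(workdir.glob("page-*.tif")):
        out_base = tif.with_suffix("")
        subprocess.run(ocr_cmd(tif, out_base), check=True)
        outputs.append(out_base.with_suffix(".pdf"))
    return outputs
```

Calling `rasterize_and_ocr("input.pdf", "ocr_work")` (both names are placeholders) would leave one searchable PDF per page in `ocr_work`. Merging those per-page files into a single document is left out of this sketch.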

 

2 replies

Meenakshi_Negi
Legend
March 2, 2020

Hi Morganw,

Please try the workaround that Bernd_Alheit mentioned and check whether it helps.

If you still experience the issue, would you mind sharing the original PDF with us? We would like to check the file on our end.

You can share the PDF using the steps provided here: https://forums.adobe.com/docs/DOC-7161

Let us know if you need any help.

Regards,

Meenakshi

Known Participant
March 2, 2020

I am currently confirming whether this worked.

Unfortunately, the original PDF is a proprietary corporate document belonging to an adversary in a legal proceeding, and I'm guessing that has something to do with why we received the document in this barely usable format.

Thus I can't provide the document as requested.

Known Participant
March 2, 2020

This produces the desired result, although I'm still not clear on why the original file could not be OCR'd. Thank you!

Bernd Alheit
Community Expert
March 2, 2020

Maybe the document was already OCR'd.