
OCR function does not put pages in the correct orientation

Explorer, Mar 03, 2023


I have a paid-for program that says it is Adobe Acrobat Standard 2020. I used to use Acrobat 9. I work with documents that have a mixture of page orientations: "text" is portrait orientation and tables are landscape orientation. While doing OCR, Acrobat 9 would correctly orient all of the pages so that every page was readable without my needing to manually rotate anything. I cannot find any way to make this program do the same thing. Is there a setting I am missing? It seems odd that the usability of this program has diminished so much over time. It is a very time-consuming endeavor to go through and fix this.

Note: I am not the one making the original document. I am the end user. I am given documents that were made by other people in organizations other than mine, and I do not have the ability to ask them to redo their documents. I need to use these documents efficiently to write reports, including extracting the data out of tables to use in Excel (among other things).

TOPICS
How to, Scan documents and OCR

Explorer, Mar 04, 2023


Should this be in a different "topic"?  Is there anyone who can help with this?  If not, how do I get help with this?

Community Expert, Mar 04, 2023


Hi Darla,

I'm sorry, but there are a bunch of issues here. For one, Acrobat 9 is well beyond "end of life." That is, it is so old that Adobe no longer supports it. Think of an old car whose parts are no longer made.

Next, are you using the same computer OS that you were using when you first got Acrobat 9?

Lastly, if you are not making the original documents, who is, and how are they making them? Are they being scanned? Assembled? What format are they in? PDF? TIFF? JPG?

I'm not sure anyone can help you, at least not without more information.

Explorer, Mar 04, 2023


Hi Gary. Thank you for replying!!! I am not using Acrobat 9. [I wish I were.] I am using Adobe Acrobat Standard 2020 on a Windows 11 PC. The documents come to me in PDF format. All different people make them, and I would guess they use different methods, depending on what is available in their workplace and what they know how to do. Some arrive as scans (i.e., the entire document was printed, then scanned). Some come from microfiche. Others have part of the document as a "text PDF" while other parts, such as tables, are printed, scanned, then inserted as images. The one I am currently looking at was originally written in Word 2016 as a whole, but some pages include images that are scanned tables or other documents.

What I do know for certain is that I used to be able to run OCR on an entire scanned document, and the end result would have all of the OCR'd pages properly oriented. That is, the result would be a mixture of portrait and landscape pages, so that I could read every page without having to rotate it.

I think what you are going to tell me is that Adobe Acrobat will no longer do that. But I hope you will say something other than that. As I said, I find it shocking how much the usability of this program has diminished over time. Acrobat 6 was better than 9, and now this....

Community Expert, Mar 04, 2023


Hi Darla, first off, my apologies. You did state that you were on Acrobat Standard, but my eye caught your comment about Acrobat 9 and fixated on that. Again, my apologies.

Otherwise, I have to say that the range of materials you are receiving is extraordinarily varied. I've never worked with a PDF generated from microfiche, and I have no idea what resolution those images have. In addition, taking documents, printing them, scanning the printout, and putting that into another document is a pre-planned nightmare.

Also, as a Mac user, I've never used Acrobat Standard; I've always used Acrobat Pro. So, unfortunately, I've no idea what OCR differences there are between the two. But I know that what you're describing would be a big challenge for Acrobat Pro.

Another obstacle is that because the documents you are receiving are undoubtedly of various resolutions, the quality of any OCR will be questionable. Let me give you a classic example: depending on the quality and resolution of the scan, and especially if the original scan was saved as a highly compressed JPG, the letter combination "ri" is easily read by the OCR as an "n." Thus, the word "right" will be seen as "nght." As a result, a search for the word "right" would not find it.

Also, scans of Excel charts are notoriously challenging for OCR.

Suffice it to say, given the material you have and your needs and expectations, I believe this is too great a challenge for Adobe Acrobat's OCR engine. I suggest contacting other OCR companies to see if they offer a trial period, and then seeing how well their software does. You might find that Acrobat does about as well as anything else out there, but then you'll know for sure.

I wish you the best of luck; you have a big challenge with these documents.

Community Expert, Mar 04, 2023


++Adding to Gary's always valuable guidance,

It seems like you're unhappy with many different things, but I will stick to the main topic of your original post.

My question is: how are you inserting those documents into the PDF document that you're editing with Acrobat 2020?

For example, are you combining files with the Combine Files tool?

Are you using the Organize Pages tool, or strictly adhering to the Scan & OCR tool to insert files?

What exactly are all the methods that you're using?

And at which stage of these processes do you have to manually edit the PDFs that are sent to you?

For instance, unlike other file import or file insertion methods in Acrobat, if you're using the Scan & OCR tool to upload scanned image files directly onto a PDF document that is open in Acrobat, the tool should upload each scanned image in the correct page orientation (based on the PDF structure of that document). It will also stretch and fit the image to the page dimensions that were assigned to the document when it was produced.

In addition, the Scan & OCR tool will perform text and image recognition automatically as the file uploads onto that document (to separate the text and image content layers for further editing).

I get all of your frustration, but so that we can assist you better, please share specific step-by-step details of the tool or methods that lead to the incorrect page orientation.

If you can share a file with no sensitive data in it, we may be able to reproduce the problem and assist you better.

Explorer, Mar 05, 2023



@ls_rbls Thank you so much for trying to help me. I understand that my situation makes it frustrating to help me, and I apologize. The documents are confidential, and I cannot freely share them with unauthorized people. It would be great if I could, because I know very little about how to MAKE PDFs. That's not my job.

In fact, I am not attempting to do anything like insert things, edit PDFs, or combine files. I am trying to use the PDF to write a report. This involves being able to search (which doesn't work if the document is not OCR'd), and I sometimes wish to copy text, and I OFTEN need to extract data from tables in the reports so I can work with the data in a spreadsheet.

Basically, what has worked in the past is to use what is now the "Scan & OCR" tool on the report. It used to skip pages that were already readable and OCR those that weren't. When it did that, it would correct the orientation, so that landscape pages were readable. If it didn't work, I would "print" some pages as a new PDF (perhaps cropping something extraneous, like a header), and when I ran the OCR on THAT it would work. (Or sometimes not. Haha.)

Oh. Well, I take that back. I also add bookmarks (is that editing?), and stuff like that to help me navigate efficiently.

As I have already mentioned, I am not the one who makes the reports. I am the end user (three layers removed, as a contractor). The people who write the reports are instructed to submit them as PDFs. Some are scanned from printed material. Some (like this one) were done in Word and (I think) Word printed them as a PDF. One time I received a report that someone wrote using Word and then used some kind of software to turn into an IMAGE PDF for no reason that I can discern. ALSO, I do sometimes get reports where almost the whole thing is a properly done, searchable text PDF, including the tables. I know it is possible to do it. I also know that I have previously managed to work with many (not all) of the documents that come to me in imperfect formats, although it is more difficult to work with the tables.

You will have to believe me when I tell you: I simply do not have the ability to require that this be done correctly on the front end. Neither do I have the ability (or permission!!!!) to directly contact the report writers to question them. If I ask my boss to ask the people who hired us to go back to the people who wrote the reports, he will not do it. I know this because I just asked him. He took the work away from me and gave it to someone else, and later on I will get to edit their work. While I check behind them, I will have to work with a 561-page document that is a PDF on my computer. I will have to navigate through it as though it were a book, because only the first 41 pages of it will be searchable. [It is not doable for me to print out something this large on my home printer.]

Also, my employer does not buy the software. I have to buy my own software, and I bought this... thinking it would work as well as Acrobat 9. I do not have unlimited funds to buy (or, really, subscribe to) all different kinds of software, for example ABBYY. So it is upsetting that the software apparently does not simply and easily do the same thing as Acrobat 9.

You are correct that I am having multiple issues, and not all of them are with Acrobat, obviously. Again, thank you. And I apologize that this is all out of order and random.
