Can you OCR, or edit, text of a PDF without the program rotating the page?

Explorer, May 24, 2023

I have a few PDFs that need to be edited, but I would like the page layout to remain the same. The problem I'm having is that when I choose to edit the text, the OCR runs automatically and rotates the entire page and its text boxes so that everything is horizontal. You can see the page edges I made in the examples.

Is there a way to edit the text without the text boxes and page rotating? I want to be able to edit the text at its original angle. Here are 2 examples of what I'm explaining. Page 1.pdf is the original. Page 2.pdf is what happens to the page when I select the Edit Text box to change some of the text. You will notice that when the Recognize text box is unchecked, the page stays fine but is uneditable, because there are no text fields available to edit. Once it is checked, the text is editable, but the page rotates to make the text boxes horizontal. Is there a way to edit the text at an angle and have the page layout stay the same? These are just example pages. They don't mean anything.

 

Thank You

TOPICS
Edit and convert PDFs, PDF, Scan documents and OCR
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert, May 24, 2023

Hi, @Sean_McD, I'm confused. When I read your issue, I assumed the text was 90° off of the page alignment. But, when I looked at your examples, they appeared to be 10-15° off. Is the text originally on the page 10-15° off? If so, why?

Explorer, May 24, 2023

Most of these pages are xeroxed forms that were copied numerous times, so the text is not all lined up horizontally. Because of this, if I go in and select Edit Text, the page is automatically OCR'ed and rotated to make the text horizontal. When this happens, corners of the page are cut off and it's not laid out like the original page. How can I stop it from automatically rotating the page when I go to the Edit Text tab? Is there a way to keep it as is and still be able to edit the text?
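To illustrate why corners can get cut off, here is a rough geometric sketch in Python. It assumes the page content is rotated while the page size stays the same, which is only a guess at what happens during straightening, not a description of Acrobat's internals. The function name `rotated_bounding_box`, the US Letter dimensions, and the 12° angle are all illustrative assumptions.

```python
import math


def rotated_bounding_box(width, height, angle_degrees):
    """Return (w, h) of the smallest upright box that fully contains
    a width x height rectangle rotated by angle_degrees."""
    theta = math.radians(angle_degrees)
    c = abs(math.cos(theta))
    s = abs(math.sin(theta))
    return (width * c + height * s, width * s + height * c)


# Assumed example: a US Letter page (inches) rotated by 12 degrees.
page_w, page_h = 8.5, 11.0
box_w, box_h = rotated_bounding_box(page_w, page_h, 12)

print(f"canvas needed to keep every corner: {box_w:.2f} x {box_h:.2f} in")
print(f"overflow beyond the original page: "
      f"{box_w - page_w:.2f} in wide, {box_h - page_h:.2f} in tall")
```

Under these assumptions, any rotation other than 0° needs a canvas larger than the original page, so if the page size is kept, whatever falls outside it is lost.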

Community Expert, May 24, 2023

Oy, I do not think there is a way. OCR can't really do its thing on text that is not level. Then, when you add that these are xeroxes of xeroxes, you have image degradation going against you.

 

The only thing I can think of (and you're not going to like this) is to scan each errant page, open it in Photoshop, straighten it, use Levels to limit the gray page surface and enhance the black of the text, and then save the document. The thing you need to work out is whether it will be faster to do what I just said or to retype it from scratch.

 

More information on enhancing an image can be found in this blog post I wrote for Adobe some time back.

 

https://community.adobe.com/t5/adobe-community-professionals/scanning-clean-searchable-pdfs/m-p/4785...

 

Good luck!
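As a rough illustration of what the Levels step in the reply above does to a scan, here is a minimal Python sketch. It assumes a simple linear mapping with no gamma control, so it is not Photoshop's actual implementation. The function name `apply_levels` and the sample pixel values are hypothetical.

```python
def apply_levels(pixels, black_point, white_point):
    """Linearly remap 8-bit grayscale values so that black_point
    becomes 0 and white_point becomes 255, clamping everything else."""
    if not 0 <= black_point < white_point <= 255:
        raise ValueError("need 0 <= black_point < white_point <= 255")
    span = white_point - black_point
    result = []
    for value in pixels:
        scaled = round((value - black_point) * 255 / span)
        result.append(max(0, min(255, scaled)))
    return result


# Hypothetical row from a photocopy: dark-gray text (30, 60),
# a mid-gray smudge (128), and light-gray paper (200, 235).
row = [30, 60, 128, 200, 235]
cleaned = apply_levels(row, black_point=60, white_point=200)
print(cleaned)
```

Raising the black point pushes dark-gray text to solid black, and lowering the white point pushes the gray paper to pure white, which is the effect described above.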

Explorer, May 25, 2023

That's what I was afraid of, but it's not a question of fixing a few pages that were scanned. We scan over 50k pages a day. We don't always have to edit or fix things, but when we do, Photoshop might be the best way. It just adds more time.

Most of our clients don't have "dirty" files like that, but some do. Attorneys have to share all files, so there are copies of copies that are filed and scanned. Other clients have forms that aren't reprinted, but are just copies of old prints. That's where this happens. I did learn to run the OCR with Scan & OCR set to "Searchable Image (Exact)", which runs the OCR on the pages as is, and nothing is changed. That's why I was wondering if there was a way to edit a page the same way, as is, with nothing automatically changed.

 

Thank you for your help. It saved me a lot of time instead of spending hours trying to find a way that doesn't exist.

 

Sean

Community Expert, May 25, 2023 (accepted solution)

Yes, I'm sorry that I didn't have better news.

 

Ostensibly, Acrobat should be able to take the twisted page, do what it needs to do, and return it to its twisted state. Unfortunately (for you), Adobe takes the position that "everyone wants their pages straight." So, after the initial straightening, it just leaves it like that.

 

Best, and good luck!

Contributor, Apr 07, 2024

Have you tried the solution suggested here?
https://community.adobe.com/t5/acrobat-discussions/how-do-i-recognize-text-under-the-enhance-tools-w...
Basically, if instead of clicking Recognize Text you click Enhance, change the settings (deselecting the deskew option), and then click the Enhance button, you should be able to OCR without the text being deskewed.
