OCR and line numbering

New Here, Nov 03, 2023

Dear Colleagues,

We need help with the following issue in Acrobat Pro. We use OCR frequently and need to resolve it.

[Screenshot: Martin333091528ga0_0-1699027022962.png]

The text recognition is very good, but whatever the settings, the line numbers are always output as text, which is very time-consuming to delete. Do you have any ideas on how to adjust the PDF so that the Word output lets us remove those line numbers quickly? Thank you very much in advance.

Best regards,

Martin

TOPICS
Edit and convert PDFs , PDF , Scan documents and OCR
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert, Nov 03, 2023

In Acrobat Pro, redact the line numbers.

Community Expert, Nov 03, 2023

To elaborate a bit: Use the Mark for Redaction tool on the first page to draw a rectangle over the area where the line numbers appear. Assuming that's the same for all pages, right-click the comment and select "Repeat mark across pages". Then apply the redactions and export the text.

Community Expert, Nov 04, 2023

If you do not want the line numbers, why do you not crop them out during your scanning process?

 

Also, remember that Acrobat does not scan on its own; it uses the TWAIN interface to talk to your scanner's software. That means you control the whole process from the scanner's software. Put simply: if you do not want the numbers, don't scan them in the first place!

New Here, Dec 27, 2023

I get my scanned PDFs from the court or from other lawyers, so I don't have the option of not scanning the line numbers.

It would be great if Adobe AI could recognize the vertical column of numbers half an inch to the left of every line of text, consider the possibility that it is unwanted line numbering, and offer an option to leave those numbers out when exporting to a Word document.
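
Until something like that exists, the same idea can be approximated on the exported text outside Acrobat. Below is a minimal Python sketch, not an Adobe feature: it assumes the export has been saved as plain text with each line number at the start of its line, and it strips a leading number only when it is part of a run that counts upward by one. The function name `strip_line_numbers` and the `min_run` threshold are made up for illustration.

```python
import re

# A leading integer of up to three digits, optionally followed by text,
# e.g. "12  The witness stated ..." or a bare "12" on an empty line.
_LEADING_NUMBER = re.compile(r"^\s*(\d{1,3})(?:\s+(.*))?$")


def strip_line_numbers(lines, min_run=3):
    """Remove leading numbers only where they count upward line by line.

    `lines` is a list of strings without trailing newlines (assumption).
    A number is treated as line numbering only if it belongs to a run of
    at least `min_run` consecutive lines numbered n, n+1, n+2, ...
    """
    parsed = []
    for line in lines:
        match = _LEADING_NUMBER.match(line)
        if match:
            parsed.append((int(match.group(1)), match.group(2) or ""))
        else:
            parsed.append(None)

    result = list(lines)
    i = 0
    while i < len(parsed):
        if parsed[i] is None:
            i += 1
            continue
        # Extend the run while each number is exactly one more than the previous.
        j = i
        while (j + 1 < len(parsed)
               and parsed[j + 1] is not None
               and parsed[j + 1][0] == parsed[j][0] + 1):
            j += 1
        if j - i + 1 >= min_run:
            for k in range(i, j + 1):
                result[k] = parsed[k][1]
        i = j + 1
    return result
```

Requiring a consecutive run is a deliberate choice: a sentence that merely starts with a number is left alone, at the cost of missing numbering that OCR has broken up or misread.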

Community Expert, Dec 31, 2023

Have you tried to redact the numbers as suggested above?

 

Also, depending on how and with what the PDFs were generated, the numbers might be on their own layer, in which case you could simply delete that layer in Acrobat.

 

One other question: are you receiving the PDFs with the text as an image (so that you need to run OCR yourself), or are they already searchable when you receive them? (Just verifying the issue.)

 

As an aside, and you can decide whether this is worth the time: if you open the PDFs in Photoshop, you can crop them and save them as Photoshop PDFs. These tend to be much larger in file size. If that's an issue, you can open them in Acrobat and resave them as "Reduced Size PDF", which brings them back to a more normal file size.
