Acrobat problems with patent documents

Explorer, Mar 23, 2020

For years I have seen the same OCR problems with PDFs of U.S. patent documents, particularly of a certain vintage (say mid-2000s) or older. Typically these are PDFs downloaded from Google Patents, though others come to me by email from other people so it's unclear where they originated. (The US Patent & Trademark Office does not store patents in PDF; they inexplicably still use TIFF.) The main problem is that "fi" ligatures show up as unrecognized ("?") when you copy text to the clipboard. Other OCR problems include lower-case w routinely showing up as upper-case W.
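For copied text where the ligature survives as a single Unicode character (for example U+FB01 for "fi"), a small post-processing step can expand it back to plain letters. The following is only a sketch under that assumption; the function name `fold_ligatures` is hypothetical, and it cannot recover a ligature that has already been replaced by "?" in the text layer, which is the case described above.

```python
# Expand the common Latin ligature code points into their plain letters.
# Assumption: the clipboard text still contains the ligature characters
# themselves; a literal "?" carries no information and is left untouched.
LIGATURES = str.maketrans({
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
})


def fold_ligatures(text: str) -> str:
    """Return text with the ligatures listed above replaced by ASCII letters."""
    return text.translate(LIGATURES)
```

An explicit table is used here rather than full Unicode compatibility normalization so that nothing else in the copied text is altered.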

 

The other problem I would like to report is Acrobat's inability to support text selection on the two-column layout of patent documents. Various things fool it into selecting text from the other column, including (but not limited to) hyphens. 

 

It would be great if Adobe could finally fix this; it's been a problem for many years.

 

Thanks,

 

- bill. 

TOPICS
Scan documents and OCR

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND, Mar 23, 2020

What version of Acrobat do you run today (e.g. 11.0.33, 2016.123.92323)? Not "latest" please.

Explorer, Mar 31, 2020

The version is 2020.006.20042. But this is irrelevant as I have had this problem for years, through many updates.

Community Expert, Mar 31, 2020

Can you share a sample file with us? You can attach it to the original message using the tiny paperclip icon at the bottom when you edit it, or upload it to a file-sharing website (like Dropbox, Google Drive, Adobe Cloud, etc.), generate a share link and then post it here.

New Here, Jan 16, 2024

Re: text selection on the two-column layout 

 

Have you tried holding down the "alt" key while left-clicking and dragging a box over a column? Note that the cursor must be the text selector (I-beam) when you start drawing the box around the text, i.e. it must be over selectable text when you first click, or nothing gets selected.

I know that doesn't completely solve the issue (you can only copy from a single column at a time, i.e. you cannot select the bottom of one column and continue to the top of the next), but it is something.
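For anyone pulling the text out programmatically instead of copying by hand, the column-by-column reading order described here can be approximated from word positions. This is a minimal sketch, not how Acrobat works internally: it assumes you already have each word's text and position on the page (for instance from a PDF library's word-extraction call), that the page has exactly two columns split at the horizontal midpoint, and that words within `line_tol` points vertically belong to the same line. The names `Word`, `two_column_order`, and `line_tol` are hypothetical.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    text: str
    x0: float   # left edge of the word, in points
    top: float  # distance from the top of the page, in points


def _read_column(words: List[Word], line_tol: float) -> List[str]:
    """Group one column's words into lines, top to bottom, left to right."""
    lines: List[List[Word]] = []
    for w in sorted(words, key=lambda w: (w.top, w.x0)):
        # Same line if the word sits within line_tol of the line's first word.
        if lines and abs(w.top - lines[-1][0].top) <= line_tol:
            lines[-1].append(w)
        else:
            lines.append([w])
    return [" ".join(w.text for w in sorted(line, key=lambda w: w.x0))
            for line in lines]


def two_column_order(words: List[Word], page_width: float,
                     line_tol: float = 3.0) -> str:
    """Return the page text with the whole left column before the right one."""
    mid = page_width / 2  # assumption: the column gutter is at the midpoint
    left = [w for w in words if w.x0 < mid]
    right = [w for w in words if w.x0 >= mid]
    return "\n".join(_read_column(left, line_tol) + _read_column(right, line_tol))
```

Full-width elements such as the title block on a patent's first page would need separate handling, since a midpoint split would cut them in half.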

New Here, Oct 01, 2024

Unfortunately, this does not solve the problem. I've had this problem for YEARS with multicolumn PDFs, most commonly U.S. patents and patent publications. I'm going to try to attach a screenshot showing how I began the highlighting with the word "The" and ended with "cartridge", all in column 1, yet it captures parts of column 2 in a very strange way because of how Adobe identifies the flow of the text.

[Attached image: Adobe Reflow Multicolumn Problem.jpg]
