October 30, 2019
Question

Copy-pasting replaces digits with other digits in some PDFs containing Hebrew text

In some PDF files, when I copy and paste text (or extract it by any other means I know of), the digits in the pasted text are replaced with other digits, with no discernible logic: "1995" becomes "1001", for example (try the attached file). The number of digits is always preserved, though.

The problem is consistently reproduced on different computers and operating systems using both the newest and older versions of Acrobat Reader. The digit replacement "rules" are self-consistent for each file, but differ in each affected file.
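
To make the "self-consistent per file" observation concrete, here is a minimal sketch of how I would check it, assuming I have a few values typed in by hand from what is visible on screen, paired with what the same spots produce when pasted. The function names `infer_digit_map` and `is_affected` are hypothetical helpers made up for illustration, and this only characterizes the substitution; it says nothing about its cause.

```python
import string


def infer_digit_map(pairs):
    """Infer a per-file digit substitution from (expected, pasted) pairs.

    `expected` is the value as read on screen, `pasted` is what copy+paste
    produced. Returns a dict mapping each expected digit to its pasted digit.
    Raises ValueError if the pairs cannot be explained by one fixed mapping.
    """
    mapping = {}
    for expected, pasted in pairs:
        # The number of characters is assumed to be preserved.
        if len(expected) != len(pasted):
            raise ValueError(f"length mismatch: {expected!r} vs {pasted!r}")
        for e, p in zip(expected, pasted):
            if e not in string.digits:
                continue  # only digits are affected; ignore everything else
            if mapping.setdefault(e, p) != p:
                raise ValueError(
                    f"digit {e!r} maps to both {mapping[e]!r} and {p!r}"
                )
    return mapping


def is_affected(mapping):
    """True if at least one digit is pasted as a different digit."""
    return any(e != p for e, p in mapping.items())


# The example from this post: "1995" is pasted as "1001".
digit_map = infer_digit_map([("1995", "1001")])
print(digit_map, is_affected(digit_map))
```

Note that in this example both "1" and "5" come out as "1", so the substitution is not necessarily reversible: even with the mapping known, the original numbers cannot always be recovered from the pasted text.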

NOTE 1: The rest of the text is pasted perfectly, so this isn't a missing-font issue or anything similar; we're talking about normal ASCII digits. This actually makes the problem worse, because the changes are easy to miss and numbers are often the most important part of the data.

NOTE 2: I've only encountered this problem in files with Hebrew text, though that doesn't rule it out for other languages.

Any ideas why this is happening, and how to solve or at least auto-detect it?
