
Text encoding errors when copying and pasting from PDF

New Here,
May 17, 2021

I have trouble copying and pasting text out of a PDF.

The PDF displays all right in Acrobat Reader and Acrobat DC, but when I try to copy/paste or export the text (no matter whether I export to Word, RTF, etc.), some characters (not the umlauts!) are misinterpreted, rendering the text illegible. For instance, "ft" becomes "=".

How can I change the encoding when saving as an Acrobat file?

Thanks for any input on that matter.

TOPICS
Edit and convert PDFs
Community Expert,
May 17, 2021

This is an issue with how the file is created. Did you create it? If so, how?

New Here,
May 17, 2021

Unfortunately not. I got these files from my client, who is most likely using Microsoft 365. Searching Google for this problem (using "Word" AND "encoding") brought up only posts from 2016 at the most recent...

New Here,
May 17, 2021

What I do not get is this: if Acrobat displays it correctly, why can't I also somehow get it to copy/paste the right thing?

Community Expert,
May 17, 2021

Think of the Wingdings font. What you see on the screen is not the actual character you type in order to get that symbol, and if you copy it from Word and paste into Notepad, for example, it will not appear the same. There's a mapping between the actual character and the symbol it represents within that font. This is more or less what's happening in your case: You see one character, when in fact it's another. This needs to be solved in the source document, or the file has to be re-created from scratch (for example by exporting all pages as images and then creating a new PDF file from those images and running Text Recognition on it).
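To make the mapping idea concrete, here is a minimal Python sketch. It is a toy model, not how Acrobat is actually implemented: the glyph IDs, the two tables, and the function names are all made up for illustration. It assumes the font stores one table describing what each glyph looks like and a separate table saying which characters each glyph stands for (in a PDF this second role is typically played by the font's ToUnicode table). Display uses only the first table, copy/paste uses only the second, so a wrong entry in the second table leaves the page looking fine while the copied text is garbled.

```python
# Toy model of a PDF font: all IDs and tables below are hypothetical.

# What each glyph looks like when drawn (glyph 2 is an "ft" ligature).
GLYPH_SHAPES = {1: "a", 2: "ft", 3: "e", 4: "r"}

# What copy/paste believes each glyph means. Entry 2 is wrong on purpose,
# mimicking the "ft" becomes "=" symptom described in this thread.
TO_UNICODE_BROKEN = {1: "a", 2: "=", 3: "e", 4: "r"}

# The same table as the source application should have written it.
TO_UNICODE_FIXED = {1: "a", 2: "ft", 3: "e", 4: "r"}


def render(glyph_ids):
    """Simulate on-screen display: only the glyph shapes matter."""
    return "".join(GLYPH_SHAPES[g] for g in glyph_ids)


def extract(glyph_ids, to_unicode):
    """Simulate copy/paste: only the glyph-to-character table matters."""
    # Glyphs with no entry come out as the Unicode replacement character.
    return "".join(to_unicode.get(g, "\ufffd") for g in glyph_ids)


page = [1, 2, 3, 4]  # the word "after", stored as glyph IDs

print(render(page))                      # after
print(extract(page, TO_UNICODE_BROKEN))  # a=er
print(extract(page, TO_UNICODE_FIXED))   # after
```

Nothing in the viewer can recover "ft" from the broken table, which is why the fix has to happen in the source document, or by rebuilding the text layer with Text Recognition.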
