Participant
June 29, 2026
Question

Caesar cipher


Has anyone dealt with mapping issues and possible font transformations that make a PDF illegible, but only for specific characters, lines, words, and relevant phrases?
 

The result shows up at the parsing level, not typically at the human-reading level.

I know it’s quite vague but it’s also not buried in the weeds where it seems to want me to live. lol 

thanks 

    1 reply

    Amal Jaiswal
    Community Manager
    June 30, 2026

      

    Hi @10546 

    Hope you are doing well, and thanks for the detailed description. This is actually a common and frustrating issue with PDFs: the text looks fine but doesn't extract or parse correctly. It almost always comes down to font encoding or a broken character mapping.

    Please try the steps below and see if that works:

    1. Check if it's a scan vs. real text. Open the PDF, select some of the affected text with the Selection tool, and copy-paste it into a plain text editor (like Notepad). If nothing can be selected at all, the page is most likely a scanned image. If the text pastes as gibberish or symbols, that confirms a font/encoding mapping issue rather than a visual rendering bug.

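    2. Run Acrobat's built-in OCR/Text Recognition. Go to Tools > Scan & OCR > Recognize Text. OCR works from how the page looks rather than from the font's character mapping, so the recognized text can come out clean even when the original ToUnicode mapping is broken. Note that Acrobat may decline to run OCR on pages that already contain "real" (renderable) text; if that happens, move on to the next steps.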

    3. Check the font embedding. Go to File > Properties > Fonts tab. Look specifically at the fonts used for the affected words/lines. If you see a font listed as "(Embedded Subset)" using a custom or non-standard encoding, that's frequently the culprit. Subsetted fonts sometimes remap glyph IDs in ways that look correct visually but break parsing/extraction.

    4. Try "Save As" to a fresh PDF. Use File > Save As Other > PDF, or better, print to a new PDF (Print > Adobe PDF as the printer) and see if the issue persists in the regenerated file. This tells us if it's a structural issue with the file itself vs. something else downstream.

    5. If it's isolated to specific characters/phrases, check whether those happen to share a font, were pasted in from another source (like a Word doc or web page), or contain special characters. Mixed-source documents are a common cause of this exact "only some characters" pattern.
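
    If you'd rather not eyeball the pasted text from step 1, here is a minimal Python sketch that lists the code points which often point to a broken mapping (replacement characters, private-use code points, stray control characters) along with their line and column. The function name suspicious_chars and the sample string are made up for illustration, and the categories it flags are only a rough heuristic, not a definitive test.

```python
import unicodedata


def suspicious_chars(text):
    """Return (line, column, code point, reason) tuples for characters
    that commonly appear when a PDF's character mapping is broken."""
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            category = unicodedata.category(ch)
            if ch == "\ufffd":
                reason = "replacement character"
            elif category == "Co":
                reason = "private-use code point"
            elif category == "Cc" and ch not in "\t\r":
                reason = "control character"
            elif category == "Cn":
                reason = "unassigned code point"
            else:
                continue
            findings.append((line_no, col, f"U+{ord(ch):04X}", reason))
    return findings


if __name__ == "__main__":
    # Hypothetical sample: paste your own copied text here instead.
    sample = "Invoice total\n\uf041mount due: 42\ufffd00"
    for finding in suspicious_chars(sample):
        print(finding)
```

    If the flagged positions all sit in the same words or lines, that lines up with step 5: those spans probably share one font or came from one pasted source.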

    Could you let us know:

    • Is this PDF originally a scan, or created digitally (e.g., exported from Word/InDesign)?

    • Does the issue happen with one specific file, or all PDFs you create/open?

    • What tool is doing the "parsing" downstream: is it your own script, another app, or something else?

    That'll help us pinpoint whether this is a font subsetting issue or something specific to your workflow.
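
    If the downstream parser is your own script, one quick check (given the thread title) is whether the garbled text is a consistent one-to-one substitution of what you see on the page. A consistent substitution would point toward a font encoding/mapping problem rather than random corruption. The sketch below is only an illustration: substitution_map and repair are hypothetical helper names, and it assumes the visible and extracted strings have the same length, which may not hold if ligatures are involved.

```python
def substitution_map(visible, extracted):
    """Build a map from each extracted character to the visible one.

    Returns None if the lengths differ or if one extracted character
    would have to stand for two different visible characters.
    """
    if len(visible) != len(extracted):
        return None
    mapping = {}
    for seen, got in zip(visible, extracted):
        # setdefault keeps the first pairing; a later mismatch means
        # the garbling is not a fixed substitution.
        if mapping.setdefault(got, seen) != seen:
            return None
    return mapping


def repair(text, mapping):
    """Apply the map; characters not in the map are left unchanged."""
    return "".join(mapping.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # Hypothetical example: what the page shows vs. what the parser got.
    visible = "hello world"
    extracted = "khoor zruog"
    mapping = substitution_map(visible, extracted)
    if mapping is None:
        print("Not a consistent substitution")
    else:
        print(repair("zruog", mapping))
```

    A map built from one known phrase only covers the characters in that phrase, so treat any "repaired" output as a diagnostic hint rather than a fix.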

     

    ~Amal