PDF generated with unreadable characters for Googlebot crawling (and for copy-pasting)

Explorer ,
Feb 26, 2022

Source: InDesign 2022 documents

Output: PDF (any format) with Embedded Fonts (western alphabets)

Problem: On screen and in print the PDF looks fine, but when it is crawled by Googlebot the text is a bunch of unreadable "garbage" characters.

Note to readers: Please do not suggest the "Copy-With-Formatting" option as a solution. I'm talking about Google crawling and search indexing in this post.

Link to PDF as example.

TOPICS
Bug , Import and export , Publish online

Views

98

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Adobe Community Professional ,
Feb 26, 2022

Quite honestly, if it exports correctly and views correctly in Acrobat, I'd think it's one of two things.

First, the embedded fonts are encrypted in standard Adobe practice, and that's somehow confusing Google, even though it should be reading the raw text and ignoring things like font and layout.

 

Second... it's Google's problem. 🙂

 

|| Word & InDesign to Kindle (& EPUB): a Professional Guide (v2 now on Amazon!)

Adobe Community Professional ,
Feb 26, 2022

I exported to Word and it came out with a lot of "garbage" characters. I suspect it is the font--as a test, try another font, such as an Adobe font.

David Creamer
Adobe Certified Instructor, Adobe Certified Professional, and Adobe Certified Expert (since 1995)

Adobe Community Professional ,
Feb 26, 2022

Ah, didn't think to try that (but then, I am wary of downloading and messing with files, even here in a fairly safe zone). Still not sure how a font, which AFAIK is only called on at rendering/display time, could mangle the text that is in theory more clear at a bot-search level.

 

Strange. I've never heard of an unreadable PDF, in English, and that works in every other way.

 

|| Word & InDesign to Kindle (& EPUB): a Professional Guide (v2 now on Amazon!)

Explorer ,
Feb 27, 2022

UPDATE: the issue seems to be solved by enabling the "Create Tagged PDF" option.

Screenshot 2022-02-27 at 22.51.30.png

Adobe Community Professional ,
Feb 28, 2022

Hi Mark,

just to make it clear, your solution was to enable the option:

[x] Create Tagged PDF

 

Thanks,
Uwe Laubender

( ACP )

Explorer ,
Mar 01, 2022

Exactly.

You need to enable the option: [x] Create Tagged PDF

to obtain a PDF that is correctly readable, and therefore indexable, by Googlebot & friends.
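
For anyone who wants to check an export before uploading it, here is a rough sketch in Python. It assumes you have already extracted the text from the PDF with some tool (for example `pdftotext` from Poppler) and only measures how much of that text looks like extraction debris. The function name `garbage_ratio` is made up for this example, and the check is a heuristic: it catches private-use, unassigned, control, and replacement characters, but not text that was remapped to ordinary-looking but wrong letters. It also says nothing about how Googlebot itself extracts text.

```python
import unicodedata


def garbage_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that look like debris.

    Hypothetical helper: counts private-use (Co), unassigned (Cn) and
    control (Cc) characters, plus the U+FFFD replacement character.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    suspicious = 0
    for c in chars:
        category = unicodedata.category(c)
        if category in ("Co", "Cn", "Cc") or c == "\ufffd":
            suspicious += 1
    return suspicious / len(chars)
```

A ratio near zero on the extracted text is a reasonable sign that the text layer is readable; a high ratio suggests the same kind of garbage described in this thread.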

 

Adobe Community Professional ,
Mar 01, 2022

That feature is enabled by default. The PDF sample you uploaded was tagged. 

David Creamer
Adobe Certified Instructor, Adobe Certified Professional, and Adobe Certified Expert (since 1995)

Explorer ,
Mar 01, 2022

Since yesterday, the PDFs on the website have been replaced with tagged PDFs. I could not edit the post containing the link.

In my InDesign 17.1.0.50, Create Tagged PDF is unchecked by default in all presets except High Quality Print.

That's kinda weird since that preset is used to send PDFs to printers and not for posting on websites…
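
Since there is some disagreement above about whether a given export was tagged, here is a quick way to check a file yourself, as a sketch only. A tagged PDF normally has a `/StructTreeRoot` entry and a `/MarkInfo` dictionary with `/Marked true` in its document catalog. The hypothetical function `looks_tagged` below simply searches the raw bytes for those markers. This is an assumption-heavy shortcut: if the catalog is stored inside a compressed object stream, the markers will not be visible and the function returns a false negative, so a real PDF parser or Acrobat's Document Properties is more reliable.

```python
import re


def looks_tagged(pdf_bytes: bytes) -> bool:
    """Heuristic: True if uncompressed tagged-PDF markers appear in the file.

    Hypothetical helper; can miss tagged PDFs whose catalog is compressed.
    """
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    # The name /Marked must be followed by whitespace and the keyword true.
    has_marked_true = re.search(rb"/Marked\s+true\b", pdf_bytes) is not None
    return has_struct_tree and has_marked_true
```

To use it, read the exported file in binary mode and pass the bytes in, for example `looks_tagged(open("export.pdf", "rb").read())`, where `export.pdf` is a placeholder file name.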

 
