Participating Frequently
June 16, 2021
Question

Two PDFs: one searchable and one not. Why?

Hi,

I work a lot with searching in PDFs. Sometimes I find a PDF on a website that is NOT searchable. I have to OCR it first, which often leaves it messy to read, and it takes time.

If I contact the company and get an emailed version of the PDF, it is fully searchable!

(And I do know it's not a photo or a scanned copy.)

Why is this? What have they done (on purpose or not) to make the online version non-searchable?

Could they have done something to the original file before publishing it online? (And why?)

Obviously this is not the case with all online PDFs. I browse through a lot every day and almost all of them work (except these odd ones). I use Adobe Acrobat Pro DC 32-bit, and my browser is Chrome/Firefox. I have not figured out the common factor yet, so I hope to get some expert help from the community!

 

Thanks in advance

/Camilla

This topic has been closed for replies.

2 replies

Participating Frequently
June 17, 2021

Maybe (I tried to investigate a bit more) some of the non-searchable PDFs come from a Word document originally. I'm not sure, but that is as far as I can find out...

Is there something odd they do when converting the Word document to PDF that makes the PDF non-searchable?

I do it sometimes myself but have never had any problems with this; all the documents I convert are fine to search in!

(I have some sort of plugin for Office that lets me easily save as PDF, but I guess that's not standard. How do they do it? No idea...)

Before I start telling them what they should do to make the PDFs searchable, it would be nice to know the solution (or at least some options).

Is there something like "do not print as a PDF" or "please do print as a PDF", or other general solutions that would be suitable for non-pros? I need general guidelines, but for these specific cases.

Again, I am very thankful for all input on this, but I understand if the answer is beyond this community's knowledge. It is surely beyond my range of skills!

Bernd Alheit
Community Expert
June 17, 2021

"Is there something odd they do when converting the Word document to PDF that makes the PDF non-searchable?"

It may be that they use bad conversion software.
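To make the "bad conversion software" idea a bit more concrete: a PDF is searchable only if its pages draw real text (text-showing operators such as Tj/TJ, using fonts whose characters can be mapped back to Unicode). If a converter flattens the page into an image or into vector outlines, nothing is left to search. Below is a rough Python sketch of a check for this, not a definitive test. The function name `pdf_text_hints` is made up for illustration, it only handles Flate-compressed streams, and it will not work on encrypted files.

```python
import re
import zlib


def pdf_text_hints(data: bytes) -> dict:
    """Heuristic scan of raw PDF bytes for signs of a searchable text layer."""
    chunks = [data]
    # Content streams are usually Flate-compressed; inflate the ones we can.
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", data, re.DOTALL):
        try:
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # uncompressed, or a filter this sketch does not handle
    blob = b"\n".join(chunks)
    return {
        # Font resources are declared somewhere in the file.
        "has_fonts": b"/Font" in blob,
        # A string followed by Tj or TJ means text is actually drawn.
        "has_text_ops": re.search(rb"[)\]>]\s*T[jJ]\b", blob) is not None,
        # A ToUnicode map lets viewers turn glyphs back into characters.
        "has_tounicode": b"/ToUnicode" in blob,
    }
```

It could be called on a downloaded file with `pdf_text_hints(open("menu.pdf", "rb").read())`, where "menu.pdf" is a placeholder name. If `has_text_ops` comes back false, the pages probably contain only images or outlines, and a reasonable first suggestion to the creator would be to export with Word's built-in "Save as PDF" rather than an image-based route. If text operators are present but `has_tounicode` is false, the fonts may lack a usable character mapping, which can also break searching.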

try67
Community Expert
June 16, 2021

How are you saving the files you view online?

Participating Frequently
June 16, 2021

I rarely save them, I open them in Acrobat.

If I save them, I sometimes use the browser's save option, and sometimes I save from Acrobat to my Dropbox.

But the error is not when I check my saved copies!

The error is when I want to search in a PDF (most times it is online).

Example here:

https://teatergrillen.se/wp-content/uploads/sites/9/2021/05/Weblista-21-05-07-.pdf
https://teatergrillen.se/

When I open it (let's say in Chrome), I want to search in it (I use the shortcut Ctrl+F). Nothing can be searched for! The same happens if I open it in Acrobat and search.

But I have emailed the company and got a PDF by email, and that one works really fine to search in!

So my question is: what can I ask these companies to do so that their online list will be searchable?

(I am thinking of something like "never save your PDF as xxx", "be sure to untick this box when saving it", or "if you upload the PDF, please check that xxx is the format", or whatever the solution is. I am happy to play around myself and learn by doing, but this case has fooled me for a long time now...)

 

Participating Frequently
June 16, 2021

Chrome uses its own internal PDF plugin, unrelated to Adobe. If you're having problems with it, report it to Google. You won't be the first, as it's known to be quite problematic, as are practically all other browser plugins that display PDFs.

For best results you should save the file and then open it in Acrobat directly.


Well, I have also saved the files to check them out in Acrobat. For the example I gave here, I tried it again just now, and it is NOT searchable when saved and opened in Acrobat. So I would say it's a cheap way out to blame Chrome (which I often do for a number of other things!).

The question again (slightly modified):
How can I explain to the creator of this PDF (not me) what to do so that the PDF he/she uploads to the site is searchable (whether I open it in my browser or save it and open it in Acrobat)?

Or, a simpler question: is there a way, when I create a PDF, to make it "non-searchable" (if I wanted that for any reason)?

I understand if no one has an answer to this. I have tried hard to find a solution but have not found one. I am just at a beginner's level on this; I can handle most common features, but not when it goes deep and demands pro knowledge. That's what I was trying to find here... well, I might not be so lucky after all, if all there is is to blame third-party software...