Hi,
I do a lot of searching in PDFs. Sometimes I find a PDF on a website that is NOT searchable. I have to OCR it first, which takes time and often leaves the result messy to look at.
If I contact the company and get an emailed version of the PDF, it is fully searchable!
(And I do know it's not a photo or a scanned copy)
Why is this? What have they done (on purpose or not) to make the online version non-searchable?
Can they have done something on the original file before publishing online? (and why?)
Obviously this is not the case with all online PDFs. I browse through a lot of them every day and almost all of them work (except these odd ones). I use Adobe Acrobat Pro DC 32-bit, and my browser is Chrome/Firefox. I have not figured out the common factor yet, so I hope to get some expert help from the community!
Thanks in advance
/Camilla
How are you saving the files you view online?
I rarely save them; I open them in Acrobat.
If I do save them, I sometimes use the browser's save option and sometimes save from Acrobat, to my Dropbox.
But the problem is not with my saved copies!
The problem is when I want to search in a PDF (most of the time it is online).
Example here:
When I open it (let's say in Chrome) I want to search in it (I use the shortcut Ctrl+F). Nothing can be searched for! The same happens if I open it in Acrobat and search.
But I have emailed the company and got a PDF by email, and that one works really well to search in!
So my question is: what can I ask these companies to do, so that their online list will be searchable?
(I am thinking of something like "never save your PDF as xxx" or "remember to untick this box when saving it" or "if you upload the PDF, please check that xxx is the format", or whatever the solution is. I am happy to play around myself and learn by doing, but this case has fooled me for a long time now...)
Sorry, the link might be hard to find...
Scroll down and click on VIN,
then on Fullständig Vinlista.
Sorry, I thought the link was to the pdf directly...
In this case the file has no text. Well, some spaces, but the wine list is a set of tiny pictures that look like text. There is nothing to search for. Acrobat Pro has an advanced feature that runs OCR when it needs to. This is not in the free Reader, and it isn't in browsers.
So, the file you were sent was made differently.
Ah, OK, thanks! How are they likely to have done it? (I know they haven't printed the original file and scanned it.) They must have done something to the file, and that's what I want to find out...
I have Acrobat Pro so I can OCR, but it takes time and the result can look odd afterwards.
Chrome uses its own internal PDF plugin, unrelated to Adobe. If you're having problems with it, report it to Google. You won't be the first, as it's known to be quite problematic, as are practically all other browser plugins that display PDFs.
For best results you should save the file and then open it in Acrobat directly.
Well, I have also saved the files to check them in Acrobat. I tried the example from here again just now, and it is NOT searchable when saved and opened in Acrobat. So I would say it's a bit of a cheat to blame Chrome (which I often do for a number of other things!).
The question again (slightly modified):
How can I explain to the creator of this PDF (not me) what to do, so that the PDF he/she uploads to the site is searchable (whether I open it in my browser or save it and open it in Acrobat)?
Or, a simpler question: is there a way, when I create a PDF, to make it "non-searchable" (if I for any reason would want that)?
I understand if no one has an answer to this. I have tried hard to find a solution but have not found one. I am just at a beginner's level on this; I can handle most common features, but not when it goes deep and demands pro knowledge. That's what I was trying to find here... well, I might not be so lucky after all, if all there is to do is blame third-party software...
Maybe (I tried to investigate a bit more) some non-searchable PDFs originally come from a Word doc. I'm not sure, but that's as far as I can tell...
Is there something odd they do when converting the Word doc into a PDF that makes the PDF non-searchable?
I do it sometimes myself, but have never had any problems with this; all the documents I convert are fine to search in!
(I have some sort of plugin for my Office, where I can easily save as PDF, but I guess that's not standard. How do they do it? No idea...)
Before I start telling them what they should do to make the PDFs searchable, it would be nice to know the solution (or at least some options).
Is there something like "do not print as a PDF" or "please do print as a PDF" or other general solutions that could be suitable for non-pros? I need general guidelines, but for these specific cases.
Again, I am very thankful for all input on this, but I understand if the answer is beyond this community's knowledge. It is surely beyond my range of skills!
"Is there something odd they do when converting the word doc into pdf, that makes the pdf non-searchable?"
It may be that they use bad conversion software.