How to make searchable PDFs from image PDFs
Hello:
I have saved a number of scanned documents (using FineReader) as image-only PDFs. However, I may need a number of them searchable in the future, so I have a couple of questions:
• Is there any way to quickly identify those PDF files that are image-only vs searchable?
• How do I convert image-only PDFs to searchable PDFs (using Adobe Acrobat or FineReader)?
Thank you for your reply. It will save me time and grief.
Hans L
Hi Hans,
This is all pretty easy.
To identify PDFs that are searchable, open them up and click on the text. If a text cursor appears where you clicked, it's searchable. If you see a transparent blue box show up instead (or if you marquee out a region and that region turns transparent blue), it's not searchable.
To make one or many documents searchable, do the following:
Whoops, wait, which version and release of Acrobat are you using? My answer can vary depending on that. (Also, just to make sure, what is your OS and what release/version is it?)
Thanks,
Hello Gary, and thanks.
I use Windows 10 Pro, x64, v. 1809.
Acrobat XI, v 11.0.23.
Any more automatic way of 'running through' pdf files to see which ones are images and which ones are searchable? I have quite a lot of them 😞
Hans L
Hi Hans,
OK no problem. I was wondering if you needed a One-at-a-time solution.
If you have a folder of files, searchable and non-searchable, this should work (my caveat here is that I have Acrobat DC Pro and I'm not 100% sure that this will work with XI, so you will have to test):
Open up the workspace called Enhance Scans. Within that, look for "Recognize Text" and select the option "In Multiple Files." In the upper left you'll see an Add Files dropdown menu. Click on that and select Add Folders. From there, select the folder you want to convert. Acrobat will then offer to recognize text in the files. Click OK.
My only other caveat is that at one time, Acrobat would stall if it found a file that was already converted. I think this was fixed before XI, but I'm not fully sure.
So please try this and see if it does what you want it to.
Good luck!
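For the question of running through a whole folder to see which PDFs are image-only, here is a rough sketch in Python. It is not an Acrobat or FineReader feature; it is a stand-alone heuristic that assumes a PDF with a text layer references at least one `/Font` resource, while a pure scan does not. The function names `has_font_reference` and `classify_folder` are made up for this example, and the verdicts are only "probably" right, so spot-check a few files by clicking in them as described earlier.

```python
import re
import zlib
from pathlib import Path

# Raw bytes between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def has_font_reference(pdf_bytes: bytes) -> bool:
    """Heuristic (assumption): a text layer implies a /Font entry somewhere."""
    if b"/Font" in pdf_bytes:
        return True
    # Newer PDFs can keep their dictionaries inside Flate-compressed streams,
    # so try to inflate every stream and look inside it as well.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG image), skip it
        if b"/Font" in data:
            return True
    return False


def classify_folder(folder):
    """Map each *.pdf file name in `folder` to a best-guess verdict."""
    results = {}
    for path in sorted(Path(folder).glob("*.pdf")):
        found = has_font_reference(path.read_bytes())
        results[path.name] = "probably searchable" if found else "probably image-only"
    return results
```

Calling `classify_folder` on your scan folder returns a dictionary of file names and verdicts that you can print or sort. Encrypted PDFs, or scans that embed a font for a stamp or page number, can fool this check.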
Like Gary said to Hans 5 years ago, this is all very easy. Here is a step-by-step manual. It's for Mac though, not for Windows. Not sure if it will work for you. There are different versions.
##### Saving images from Facebook Messenger and turning them into searchable PDFs.
1. [ ] Download image to folder. It is saved as a PNG file.
2. [ ] Open image in Apple Preview. If you open it in Adobe Acrobat, it will not be able to read the file in a searchable manner.
3. [ ] Export image as JPEG.
4. [ ] Open Adobe Acrobat.
5. [ ] Go to Tools.
6. [ ] Go to Create & Edit section.
7. [ ] Click on Scan & OCR.
8. [ ] Click on "Or recognize text in multiple files". (Does not have any visual, only text. It's easy to miss.)
9. [ ] Be prepared for a bright-ass white screen in your face despite having Dark Mode activated.
10. [ ] Click on "Add Files..."
11. [ ] Click on "Add Folders..."
12. [ ] Navigate to the folder that contains the images you want to convert to screen-reader accessible PDF files.
13. [ ] Click on "Choose" although nothing is selected and you're just in a folder. Remember, you're choosing the folder, not a specific file.
14. [ ] See how all the files that were in the folder you chose now appear in the white background list without a folder name and just the file names. That's what you want. Right above it it says "Recognize text using OCR on a set of files." OCR stands for "Optical Character Recognition". It means the same as "recognize text" except in a TLA to provide redundancy, confusion, acronyms, repetition, and an opportunity for an Oxford comma in the documentation.
15. [ ] Click OK.
16. [ ] You will now "see" a (bright) window with "Output Options". For "Target Folder" choose "A Folder on My Computer", not "The Same Folder Selected at Start" because otherwise the output files will be in the same folder as the images you're trying to make accessible and that will be super messy to separate apart. It's much better to use a separate folder.
17. [ ] Once you click on "A Folder on My Computer", new options will appear on the screen. You will see a "Choose..." button and then a white form field box that looks like it is editable, but it is not. It contains the text: "Please choose the folder for the output location." It is very user-friendly because you can just double-click that line of text (or multiple lines, depending on your font size), and then copy/paste it. That's how I got it over here. Adobe really thought that one through and I'm grateful for it.
18. [ ] Select "Keep original file names" under "File Naming" and not "Add to original file names:" because that would mean you'd have to change a whole bunch of stuff if you want to have an inaccessible and an accessible version of the PDFs you're creating. And since you're starting with PNG or JPG files, you might as well keep the same file names, since the ending will be different. I think that would make it easier to work with the files going forward because then you'll have the file names they had originally and they'll just have a dividend ending. You can just delete the original images and just keep the PDFs, which are now in a separate folder. It would be nice if it could just delete the original images nobody needs anymore but until Adobe gets their act together when it comes to accessibility, you'll have to do it manually and think that making an image screen-reader accessible is the same as "enhancing" a "scan".
19. [ ] You can keep the box for "Overwrite existing files" checked or uncheck it. It doesn't make a dividend because the original file names have a dividend ending, so it's completely irreverent.
20. [ ] Click OK.
21. [ ] Now you will see an error message that tells you that you forgot to specify the output folder location. If you don't have one already... Well... Regardless... First, you have to navigate to the folder where your original files are. Unless that's not where you want the new files, but that would make no sense.
22. [ ] Press "New Folder" in the bottom left corner of the window, then type a name for the output folder, such as "output".
23. [ ] Navigate to that folder, typically with a double-click if you have the ability to operate a mouse or similar device and have the dexterity to perform a double-click.
24. [ ] Press "Choose" to select the output folder although nothing is selected. Same thing as above.
25. [ ] Click "OK".
26. [ ] A new window will appear that asks you to select the "Document Language". Most of my documents are in "English (US)", but it really depends on your use case. Hopefully you'll know what language the document is in before you're able to read it because otherwise you're... Statute of Limitations or something. Can't remember.
27. [ ] Under "Output", you have 3 options:
1. [ ] Searchable Image
2. [ ] Searchable Image (Exact)
3. [ ] Editable Text and Images
28. [ ] I don't know what each of them does and I don't know how to continue a numerated list item in Obsidian after placing a nested enumerate environment within an item because this isn't LaTeX, so now it's a new item and that'll have to do.
29. [ ] I don't know what each of the options does, but I generally like things to be exact, so I typically choose the middle option. (I hope you remember that there were 3 and that 2 is the middle.) If you do that, the selectable options for the next field, "Downsample To" become greyed out, so you have less work to do, so that's nice.
30. [ ] Click OK.
31. [ ] Wait.
32. [ ] Now the process is done, but it won't tell you. Hopefully you didn't walk away and forget what you were doing. If everything went well, you can now open Finder or something and navigate to the output folder you created for the accessible PDFs. Then you can delete the original images, both the PNG and the JPEG files (although I thought at some point we had agreed that the JPEG file ending was just JPG, but I guess either way is fine). You can then move the PDF files from the output folder to the parent folder and then delete the output folder.
33. [ ] Now you should have a folder with screen-reader accessible PDF files with the same file names as the original images in the same folder as the original images. It's all very easy.
34. [ ] I really hope this manual isn't irreverent and helps you in your daily life.
35. [ ] On behalf of Adobe, I apologize for the inconvenience.
36. [ ] For a refresher on Title Case and a handy (not German cell phone) conversion tool when you can't remember how its spelt, visit www.titlecase.com.
37. [ ] P.S.: If you try to do this process with the PNG images saved directly from Facebook or Meta or whatever Messenger...
38. [ ] P.P.S.: I also tried using the free tools from Aspose Dot App (which I thought had an appealing ass name (Google YouTube Ismo Ass)) as well ass Small PDF Dot Com and both just produced junk.
39. [ ] I'm attaching some relevant screenshots in case you care.
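The manual cleanup in step 32 (move the PDFs up, delete the original images, delete the output folder) can be scripted. The following is only a sketch under a few assumptions: the output folder is a subfolder of the image folder and is called "output" as in step 22, the PDFs kept the original file names as in step 18, and the function name `tidy_after_ocr` is invented for this example. It deletes files, so try it on a copy of the folder first.

```python
import shutil
from pathlib import Path

# File endings treated as "original images" (assumption for this sketch).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def tidy_after_ocr(parent, output_name="output"):
    """Move PDFs from parent/output_name into parent and drop matching images."""
    parent = Path(parent)
    output = parent / output_name
    moved = []
    for pdf in sorted(output.glob("*.pdf")):
        target = parent / pdf.name
        if target.exists():
            continue  # never overwrite a PDF that is already in the parent folder
        shutil.move(str(pdf), str(target))
        moved.append(target)
        # Delete only images whose name matches a PDF that was actually produced.
        for image in list(parent.iterdir()):
            if image.stem == target.stem and image.suffix.lower() in IMAGE_SUFFIXES:
                image.unlink()
    # Remove the output folder only once it is empty.
    if output.is_dir() and not any(output.iterdir()):
        output.rmdir()
    return moved
```

Images that have no matching PDF are left alone, so a file that Acrobat skipped is not lost.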
Only 10 attachments allowed. Here are some more.
Had to combine two of them because it was a dozen at first. But now it's 10.
I hope this works for you. Should be easier to read if you prefer Dark Mode.
I think it's out of order because of AI. Or because of the Legacy thing I used because of the export. Because I wanted to change the resolution, but then it makes the file names all weird and...
I mean... It has numbers. You can figure it out.
38. [ ] P.P.S.: I also tried using the free tools from Aspose Dot App (which I thought had an appealing ass name (Google YouTube Ismo Ass)) as well ass Small PDF Dot Com and both just produced junk.
By Tommes Coughmen
I know this comment is a bit dated, but I wanted to share how much Aspose.OCR has evolved. Back when it was just starting out, it was a much simpler tool. Now, it’s grown into a robust product with significantly improved features and accuracy. The engine can extract content from images in multiple languages, handling various fonts, styles, and angles. Additionally, they've recently introduced .NET plugins at an affordable price—details are available in their blog post. The PDF to Searchable PDF app is also worth checking out if you're interested.

