Participant
August 13, 2020
Question

Command line exporting pdf to text

Is there a command that comes with Acrobat Reader to export PDF to text?
I have an application that searches files for content, and it would be better if PDF files could be searched too. The formatting of the output is not important, except that words should be separated by at least one blank.
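
To make the "at least one blank between words" requirement concrete, here is a minimal sketch of how extracted text could be searched once it exists. It is written in Python only for illustration (it is not Rapp code), and the names `words_in` and `contains_word` are hypothetical helpers, not part of any tool mentioned in this thread.

```python
def words_in(text):
    """Split extracted text on any run of whitespace and normalize each word.

    Assumption: words in the extracted text are separated by at least one
    blank or line break, which is the only formatting requirement stated.
    """
    # Hypothetical normalization: strip common punctuation and lowercase.
    return [w.strip(".,;:!?()[]\"'").lower() for w in text.split()]


def contains_word(text, term):
    """Return True if term occurs as a whole word in the extracted text."""
    return term.lower() in words_in(text)
```

Because the split is on any run of whitespace, extra blanks or odd line breaks in the converter output do not affect the result.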

I have MiKTeX 2.9, where I found the commands (bat files) pdf2ps and ps2ascii, which are used like this:
pdf2ps pdfinfile tmpPsfile
ps2ascii tmpPsfile txtoutfile
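
The two-step conversion above could be scripted roughly as follows. This is only a sketch in Python (not Rapp), assuming pdf2ps and ps2ascii are on the PATH and accept the arguments exactly as shown above. The function names and the `.tmp.ps` naming of the intermediate file are hypothetical choices made for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(pdf_path, txt_path):
    """Return the two command lines (pdf2ps, then ps2ascii) and the temp file path."""
    pdf_path = Path(pdf_path)
    # Hypothetical naming for the intermediate PostScript file.
    ps_path = pdf_path.with_name(pdf_path.stem + ".tmp.ps")
    commands = [
        ["pdf2ps", str(pdf_path), str(ps_path)],
        ["ps2ascii", str(ps_path), str(txt_path)],
    ]
    return commands, ps_path


def pdf_to_text(pdf_path, txt_path):
    """Run both steps in order and remove the intermediate file afterwards."""
    commands, ps_path = build_commands(pdf_path, txt_path)
    try:
        for cmd in commands:
            # shutil.which resolves .bat files on Windows via PATHEXT.
            exe = shutil.which(cmd[0])
            if exe is None:
                raise FileNotFoundError(f"{cmd[0]} was not found on PATH")
            subprocess.run([exe, *cmd[1:]], check=True)
    finally:
        ps_path.unlink(missing_ok=True)  # requires Python 3.8 or later
```

Calling `pdf_to_text("report.pdf", "report.txt")` would run both commands and leave only the text file behind. Whether this is fast enough for a search application is something that would have to be measured.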

But I am interested in alternatives.

Stig Rosenlund

stig.ingvar.rosenlund@gmail.com
stig.rosenlund@sverige.nu

2 replies

Legend
August 14, 2020

Acrobat Reader includes an IFilter, which is part of Microsoft's text extraction infrastructure. There is no documentation from Adobe because it is a Microsoft standard.

Stig5EAF
Author
Participant
August 14, 2020

Thanks. What are the commands, executable from the command prompt, that use this IFilter? What are their arguments and results? My search application is part of my programming language Rapp. I need commands that are available without downloading more special programs. I recommend that Rapp users download MiKTeX, so pdf2ps and ps2ascii are available if they have done so. Commands already present in Windows when Reader is installed would also work. If those are faster than pdf2ps and ps2ascii, I would use them.

Legend
August 14, 2020

This is not part of the command-line world. Microsoft declared the command line dead 25 years ago. It has been slow to die, but for the real meat in Windows you look for COM interfaces, not command lines.

try67
Community Expert
August 14, 2020

No, Reader can't do it, but plenty of other applications can, including ones that can be run from the command line.

If you're interested I could develop for you (for a fee) a custom-made tool that will export the textual contents of a PDF file (or files) to a text file, or even just search the file for a specific term and then do something with it if a match is found.

You can contact me via [try6767 at gmail.com] to discuss it further.