Command line exporting pdf to text

New Here, Aug 13, 2020

Does Acrobat Reader come with a command for exporting PDF to text?
I have an application that searches files for content, and it would be better if PDF files could be searched too. The formatting of the output is not important, except that words should be separated by at least one blank.

I have MiKTeX 2.9, where I found the commands (bat files) pdf2ps and ps2ascii, used like this:
pdf2ps pdfinfile tmpPsfile
ps2ascii tmpPsfile txtoutfile

But I am interested in alternatives.
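
For anyone who wants to script the two-step conversion above, here is a minimal sketch in Python. It assumes pdf2ps and ps2ascii are on the PATH (for example through MiKTeX) and take the arguments shown in the post. The function names are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_two_step_commands(pdf_path, ps_path, txt_path):
    """Return the two command lines from the post as argument lists."""
    return [
        ["pdf2ps", str(pdf_path), str(ps_path)],
        ["ps2ascii", str(ps_path), str(txt_path)],
    ]


def run_tool(args):
    """Run one command, resolving the tool name on PATH first.

    shutil.which honours PATHEXT on Windows, so it can locate .bat files.
    Assumption: the resolved path can be started directly by subprocess.
    """
    exe = shutil.which(args[0])
    if exe is None:
        raise FileNotFoundError(f"{args[0]} was not found on PATH")
    subprocess.run([exe, *args[1:]], check=True)


def pdf_to_text_two_step(pdf_path, txt_path):
    """Convert PDF -> PostScript -> text using a temporary .ps file."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        ps_path = Path(tmp_dir) / "tmp.ps"
        for args in build_two_step_commands(pdf_path, ps_path, txt_path):
            run_tool(args)
```

Calling pdf_to_text_two_step("report.pdf", "report.txt") would run both commands in order and discard the intermediate PostScript file afterwards. The file names here are placeholders.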

Stig Rosenlund

stig.ingvar.rosenlund@gmail.com
stig.rosenlund@sverige.nu

TOPICS
Edit and convert PDFs

Views: 4.3K
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert, Aug 14, 2020

No, Reader can't do it, but plenty of other applications can, including ones that can be used from the command-line.

If you're interested I could develop for you (for a fee) a custom-made tool that will export the textual contents of a PDF file (or files) to a text file, or even just search the file for a specific term and then do something with it if a match is found.

You can contact me via [try6767 at gmail.com] to discuss it further.

LEGEND, Aug 14, 2020

Reader includes an IFilter, which is part of Microsoft's text extraction infrastructure. There is no documentation from Adobe because it is a Microsoft standard.

New Here, Aug 14, 2020

Thanks. Which commands, executable from the command prompt, use this IFilter? What are their arguments and results? My search application is part of my programming language Rapp. I need commands that are available without downloading more special programs. I recommend that users of Rapp download MiKTeX, so pdf2ps and ps2ascii are available if they have done so. Commands already present in Windows when Reader is installed would also work. If those are faster than pdf2ps and ps2ascii, I would use them.

LEGEND, Aug 14, 2020

This is not part of the command-line world. Microsoft declared the command line dead 25 years ago. It has been slow in dying, but for the real meat in Windows you look not for command lines but for COM interfaces.

New Here, Aug 15, 2020

OK. Using this COM interface is beyond my skills. But I have discovered that ps2ascii also works directly on PDF files, and that has simplified my application.
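
The simplified approach could look roughly like the Python sketch below. It assumes ps2ascii is on the PATH and accepts a PDF as its input file, as reported in this post, with the same input and output arguments used earlier in the thread. The helper names and the latin-1 decoding of the output are assumptions made for illustration, not part of the original application.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def normalize(text):
    """Collapse every run of whitespace to one blank and lowercase the text."""
    return " ".join(text.split()).lower()


def text_contains(text, term):
    """Return True if term occurs in text, ignoring case and spacing."""
    return normalize(term) in normalize(text)


def pdf_contains(pdf_path, term):
    """Extract text with ps2ascii applied directly to the PDF, then search it."""
    exe = shutil.which("ps2ascii")
    if exe is None:
        raise FileNotFoundError("ps2ascii was not found on PATH")
    with tempfile.TemporaryDirectory() as tmp_dir:
        txt_path = Path(tmp_dir) / "out.txt"
        # Same argument order as in the thread: input file, then output file.
        subprocess.run([exe, str(pdf_path), str(txt_path)], check=True)
        # Assumption: the output is decoded as latin-1, which never fails.
        text = txt_path.read_text(encoding="latin-1")
    return text_contains(text, term)
```

Normalizing whitespace before searching fits the requirement stated in the question: the output layout does not matter as long as words are separated by at least one blank.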
