Scrape PDF based on text criteria

New Here, Jun 17, 2024

I have a document set of about 900 PDFs. They are structured like court documents, with the Title of each document "Motion in Limine to Exclude Evidence, Testimony and Reference ...." in a table on page 1 of each document.

 

I need a way to find instances of "Motion in Limine" or "MIL" on page 1 of any of the PDFs that are in this file directory. Next, return the next 50 words in the title, so I can see what the "Motion in Limine" was about and who filed it. Lastly, give me the date this document was "e-served" which is always date-stamped at the top of the PDF. Spit all this out into a clean delimited file.

 

Any suggestions for how I go about this?

TOPICS
How to , PDF

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert, Jun 18, 2024

This is possible, but it will require quite a complex, custom-made script, probably in combination with an Action (to locate those texts initially, since scanning such a large set of files will probably be too much for a script to handle on its own).
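As a rough illustration of the text-matching part only, here is a minimal Python sketch. It is not the Acrobat script and Action described above, and it assumes the text of page 1 of each PDF has already been extracted into a string by some other tool, which is not shown. The function names, the tab delimiter, the column names, and especially the `MM/DD/YYYY` format of the "e-served" stamp are assumptions made for the example; the real stamp may look different and the pattern would need adjusting.

```python
import csv
import io
import re

# "Motion in Limine" in any letter case, or the upper-case abbreviation "MIL".
MOTION_RE = re.compile(r"(?i:\bMotion\s+in\s+Limine\b)|\bMIL\b")

# ASSUMPTION: the stamp reads like "E-Served: 06/17/2024".
SERVED_RE = re.compile(r"e-?served\W*(\d{1,2}/\d{1,2}/\d{2,4})", re.IGNORECASE)


def parse_first_page(text, max_words=50):
    """Return (match, following_words, served_date), or None if no motion is found."""
    hit = MOTION_RE.search(text)
    if hit is None:
        return None
    match = " ".join(hit.group(0).split())  # collapse line breaks inside the match
    following = " ".join(text[hit.end():].split()[:max_words])
    served = SERVED_RE.search(text)
    return match, following, served.group(1) if served else ""


def write_rows(pages, out):
    """Write one tab-delimited row per matching (filename, page_1_text) pair."""
    writer = csv.writer(out, delimiter="\t")
    writer.writerow(["file", "match", "next_words", "e_served"])
    for name, text in pages:
        row = parse_first_page(text)
        if row is not None:
            writer.writerow([name, *row])


# Hypothetical page-1 texts standing in for the extracted PDF content.
sample = [
    ("a.pdf", "E-Served: 06/17/2024\nMotion in Limine to Exclude Evidence, Testimony and Reference"),
    ("b.pdf", "Notice of Hearing"),
]
buf = io.StringIO()
write_rows(sample, buf)
print(buf.getvalue())
```

Only `a.pdf` produces a row here, because `b.pdf` mentions neither phrase. With scanned documents the page text would first have to come from OCR, which this sketch does not cover.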

 

I've developed similar tools for my clients in the past and would be happy to create one for you as well (for a fee, of course). Feel free to contact me privately by clicking my user-name and then on "Send a Message" to discuss it further.
