
Bulk convert .DOT to PDF then count occurrences of words

New Here, Mar 19, 2023

Hi all

 

I am working on a project with approximately 2000 Word template (.dot) files. We are re-branding, and I need to identify which documents contain certain words/phrases so that we can scope the work needed to replace the brand name, website addresses, etc.

 

Word has very little built-in functionality for this, so my plan is to:

a) Bulk convert these files to PDF

b) Use Adobe OCR or similar functionality to identify occurrences of pre-identified phrases

 

The ideal output is a table such as below.

 

File Name   Brand XYZ   brandxyz.com   555 012 345
Doc 1       40          0              4
Doc 2       6           1              0
Doc 3       1           1              1
Doc 4       0           0              1
Doc 5       6           0              4

 

Is anybody aware of a way I could do this, please?
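For illustration, the counting step behind a table like the one above can be sketched in Python. This is only a sketch under one big assumption: the text of each template has already been extracted into a string by some other means (that step is not shown). The names `count_phrases`, `build_table` and the sample `docs` are hypothetical.

```python
import re

def count_phrases(text, phrases):
    """Count case-insensitive, non-overlapping matches of each phrase."""
    return {
        p: len(re.findall(re.escape(p), text, flags=re.IGNORECASE))
        for p in phrases
    }

def build_table(docs, phrases):
    """Return one row per document: [file name, count for each phrase]."""
    rows = []
    for name, text in docs.items():
        counts = count_phrases(text, phrases)
        rows.append([name] + [counts[p] for p in phrases])
    return rows

# Hypothetical sample data standing in for text extracted from the templates.
docs = {
    "Doc 1": "Brand XYZ is at brandxyz.com. Call Brand XYZ on 555 012 345.",
    "Doc 2": "No branding here.",
}
phrases = ["Brand XYZ", "brandxyz.com", "555 012 345"]

table = build_table(docs, phrases)
print("\t".join(["File Name"] + phrases))
for row in table:
    print("\t".join(str(cell) for cell in row))
```

The matching here is literal, so a phrase broken across a line or written with different spacing in a document would not be counted.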

 

As a secondary goal, I would like to know whether it is possible to count instances of all words. The idea is to identify occurrences of phone numbers and other data that we are not expecting to see. For example, we might find an old phone number, or the names of people on letters who no longer work for the company.

 

File Name   and   this   that
Doc 1       90    17     4
Doc 2       54    8      9
Doc 3       80    15     7
Doc 4       24    15     22
Doc 5       37    2      4
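A per-document count of every word could look roughly like the following Python sketch. As before, it assumes the document text is already available as plain strings; `word_counts` and the sample `docs` are hypothetical names, and splitting on `\w+` is just one possible definition of a "word".

```python
import re
from collections import Counter

def word_counts(text):
    """Count every lowercase word (run of letters, digits or underscores)."""
    return Counter(re.findall(r"\w+", text.lower()))

# Hypothetical sample data standing in for extracted document text.
docs = {
    "Doc 1": "This and that and this.",
    "Doc 2": "That is all.",
}

counts = {name: word_counts(text) for name, text in docs.items()}

# Build the column list from every word seen in any document.
all_words = sorted(set().union(*counts.values()))
print("\t".join(["File Name"] + all_words))
for name, c in counts.items():
    print("\t".join([name] + [str(c[w]) for w in all_words]))
```

With 2000 files the list of columns would get very wide, so in practice it may be more useful to look only at words that appear in few documents.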

 

Thanks, Brendan

1 ACCEPTED SOLUTION
Community Expert, Mar 20, 2023

This is possible, using a custom-made script. However, if you count all words it won't show you phone numbers, unless they are written as a single number. If a number is written like this "(23) 123 456 789" or "(23) 123-456-789", each part of it will show up as a separate word in the count. Only if it's written as "123456789" will it show up as a single word.
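To illustrate the point about phone numbers, here is a small Python sketch (not the script referred to above). It shows how a simple word split breaks a formatted number into pieces, and then tries a separate pattern for phone-like digit groups. The `PHONE_LIKE` pattern is an assumption made for this example and would need tuning for real number formats.

```python
import re

text = "Call (23) 123 456 789 or 123456789 today."

# A plain word split breaks the formatted number into separate "words".
words = re.findall(r"\w+", text)
print(words)
# ['Call', '23', '123', '456', '789', 'or', '123456789', 'today']

# Assumed pattern: an optional bracketed prefix, then 2 to 4 more digit
# groups, each optionally preceded by a space or hyphen.
PHONE_LIKE = re.compile(r"\(?\d{2,4}\)?(?:[ -]?\d{2,4}){2,4}")

candidates = PHONE_LIKE.findall(text)
print(candidates)
# ['(23) 123 456 789', '123456789']

# Strip everything except digits so differently formatted copies of the
# same number can be compared or counted together.
normalized = [re.sub(r"\D", "", c) for c in candidates]
print(normalized)
# ['23123456789', '123456789']
```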

 

Anyway, this is possible, but not a trivial task. If you're interested in hiring a professional to create it for you, feel free to contact me privately by clicking my user-name and then on "Send a Message".

New Here, Mar 28, 2023

Thank you very much for your reply. I was hoping it would be easier than this using Adobe, but I will go back to a solution I found that works directly from the Word docs.
