Hi all,
I've been trying to find a tool I can use to find and replace text in PDF files. We have Adobe Pro and could do it manually, but that would be very tedious and time-consuming since we have thousands of files.
Is there a way to do it programmatically? As far as I know, Adobe DC has an API we could use, but are there any libraries (C#, Python, etc.) that we can download and use directly without calling APIs?
Thank you in advance,
D
Our current set of Document Services APIs do not have the ability to search for and then replace text in a PDF. You'll need a PDF library tool to do that sort of operation.
That said, it is very, very unlikely that you will be able to accomplish your goal using just a PDF library tool. Because of the way the PDF format stores text strings, you would only be able to replace text with text that fits precisely in the same space. Adobe Acrobat is able to coalesce chunks of text and allow minor edits, but if you've ever used the tool, you know that when a replacement causes a word wrap, the text can end up overlapping other blocks of text in the document. Most PDF library tools do not interpret text blocks and would not even rewrap the text; the replacement would simply overlap the words around it.
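To illustrate why a plain search often misses text in a PDF, here is a minimal Python sketch. The content-stream fragment is made up for illustration, and the two helper functions are hypothetical names, not part of any PDF library. It assumes an uncompressed stream with simple literal strings and ignores escapes, font encodings, and everything else a real parser has to handle.

```python
import re

# A hypothetical, uncompressed content-stream fragment. The TJ operator
# shows an array of string chunks; the numbers between the chunks are
# positioning (kerning) adjustments, not characters.
content_stream = "BT /F1 12 Tf 72 700 Td [(Hel) -20 (lo W) 15 (orld)] TJ ET"


def naive_contains(stream, needle):
    """Search the raw stream the way a plain find-and-replace would."""
    return needle in stream


def joined_text(stream):
    """Join the literal string chunks of every TJ array in the stream."""
    pieces = []
    for array in re.findall(r"\[(.*?)\]\s*TJ", stream):
        pieces.extend(re.findall(r"\((.*?)\)", array))
    return "".join(pieces)


print(naive_contains(content_stream, "Hello"))  # False
print(joined_text(content_stream))              # Hello World
```

The word "Hello" is on the page, but it is stored as two chunks with a spacing adjustment between them, so a replacement has to rewrite the chunks and their positioning, and nothing in the file reflows the surrounding text.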
My recommendation is to recreate the PDF from source with the text replaced. If you don't have the source, Document Services APIs can be used to export your documents to common Office formats.
Thank you Joel, I really appreciate your detailed answer.
D
How did you manage to do this? I converted my PDF to DOCX, but how can I replace the text in the DOCX?
You'd need to use the Word API to do that.
So it can't be done using Adobe PDF Services? I'm trying to remove dependencies on Microsoft Office.
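For what it's worth, a DOCX file is a ZIP archive whose main body lives in `word/document.xml`, so a simple replacement may be possible without Office installed. Below is a rough Python sketch using only the standard library. The function name `replace_in_docx` is hypothetical. It assumes the search text sits inside a single run of the XML, which is often not true for converted documents, since Word-format files can split a phrase across several runs. It also ignores headers, footers, and footnotes, which live in other parts of the archive.

```python
import zipfile
from xml.sax.saxutils import escape


def replace_in_docx(src_path, dst_path, old, new):
    """Copy src_path to dst_path, replacing `old` with `new` in the main document part.

    Assumption: `old` appears unbroken in word/document.xml. Text that is
    split across several runs will not be matched by this sketch.
    """
    # Escape &, < and > so the strings match (and stay valid as) XML text.
    old_xml, new_xml = escape(old), escape(new)
    with zipfile.ZipFile(src_path) as src, \
         zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")
                data = text.replace(old_xml, new_xml).encode("utf-8")
            # Every other part of the archive is copied through unchanged.
            dst.writestr(item, data)
```

A call would look like `replace_in_docx("in.docx", "out.docx", "Old Name", "New Name")`, with both file names being placeholders. If a third-party dependency is acceptable, a dedicated DOCX library would likely handle split runs and the other document parts more robustly than this string replacement.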