Cannot find/replace text in PDF: some glyphs not recognized
Hello,
I have a very large editable PDF that needs to be updated: product reference codes must be replaced on over 1000 pages.
I used qpdf to extract the streams and was able to change about one third of the document easily with sed, since some of the text was stored as ASCII. The rest of the content is not easily readable, and my last resort is to automate mouse and keyboard actions to find and replace each reference code. Unfortunately, Acrobat Pro DC's search does not find some of the content, even though I can select each character; when I copy and paste that content, the pasted text is not recognizable. I can see that the PDF was created with InDesign on a Mac. I cannot get the original file used to create it, and I am on Windows. The font used is Century Gothic, and it is probably encoded differently from the version I have on my computer.
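For reference, the sed step on the readable streams amounts to something like the Python sketch below. It is a minimal sketch under assumptions, not the exact commands used: it assumes the PDF was first rewritten uncompressed in qpdf's QDF mode, and the function name `replace_codes`, the file names, and the example reference codes are all hypothetical. It only helps where the codes appear as plain ASCII bytes in the stream; it does nothing for the text that is stored in another encoding.

```python
import re

# Assumed preparation step (qpdf command line, file names are placeholders):
#   qpdf --qdf --object-streams=disable input.pdf editable.pdf
# After editing, the stream lengths and xref table can be repaired with:
#   fix-qdf < edited.pdf > fixed.pdf


def replace_codes(data: bytes, mapping: dict[bytes, bytes]) -> tuple[bytes, int]:
    """Replace every old code found in the raw bytes with its new code.

    Returns the modified bytes and the number of replacements made.
    Only codes stored as plain ASCII bytes will be matched.
    """
    if not mapping:
        return data, 0

    # Try longer codes first so a short code never matches inside a longer one.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(b"|".join(re.escape(k) for k in keys))

    count = 0

    def substitute(match: re.Match) -> bytes:
        nonlocal count
        count += 1
        return mapping[match.group(0)]

    return pattern.sub(substitute, data), count


def rewrite_file(src: str, dst: str, mapping: dict[bytes, bytes]) -> int:
    """Apply replace_codes to a QDF-mode PDF file and write the result to dst."""
    with open(src, "rb") as handle:
        data = handle.read()
    new_data, count = replace_codes(data, mapping)
    with open(dst, "wb") as handle:
        handle.write(new_data)
    return count
```

Calling `rewrite_file("editable.pdf", "edited.pdf", {b"REF-1001": b"REF-2001"})` would return the number of codes replaced; a count well below the expected total is a quick way to measure how much of the document is stored in the unreadable form.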
I tried downloading different versions of the font, but no luck. I also exported the PDF to TIFF (600 ppi resolution) and ran OCR on it, but even at that resolution it fails to recognize most of the text properly. I am stuck.
Does anyone have a solution?