Dear community,
I have hundreds of PDFs that I'm converting to text and feeding into a text-to-audio program. 90% of them have vertical text fields on every page (e.g., "This PDF was generated by... and downloaded from... at ... "), which mess up the conversion and the text-to-audio output horrendously. Is there a way to batch remove all vertical text from PDFs? All solutions are welcome, including code in Python, VBA and R.
Thank you for any info/advice
There's no such thing as "vertical text" in a PDF. Also, there's no support for any of these languages in Acrobat.
It might be possible to do it using JavaScript, if the text can be identified based on its location or contents, but it's impossible to say for sure without seeing the actual files.
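For anyone working outside Acrobat, the "identified based on its contents" idea can be sketched in Python on text that has already been extracted. This is only a rough sketch: the pattern below is a guess built from the example wording quoted in the question, not the real watermark text, and the name `strip_watermark_lines` is made up for illustration.

```python
import re

# Hypothetical pattern, guessed from the example wording in the question.
# Replace it with the exact watermark text found in your own files.
WATERMARK_RE = re.compile(
    r"this pdf was generated by|downloaded from",
    re.IGNORECASE,
)


def strip_watermark_lines(text: str) -> str:
    """Drop every line of already-extracted text that matches the pattern."""
    kept = [line for line in text.splitlines() if not WATERMARK_RE.search(line)]
    return "\n".join(kept)
```

Note that a loose phrase such as "downloaded from" could also match genuine body text, so the pattern should be made as specific as the actual watermark allows.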
Here is a link to the first page of one such PDF. I'm not familiar with the internal design of PDFs, so the box on the right looks like vertical text to me, even though it's really text rotated 90 degrees clockwise.
Dropbox - Meneghetti and Williams - 2017ysis - Fortune Favors the Bold 1.pdf
As for Python and R, there are libraries that read PDFs, and there are also libraries that save PDFs, hence my question.
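As a rough illustration of the library route, here is a minimal Python sketch that skips rotated text during extraction instead of editing the PDF. It assumes the PyMuPDF library (imported as `fitz`), whose `page.get_text("dict")` output gives each text line a `dir` tuple holding the cosine and sine of its writing angle. The function names and the tolerance value are hypothetical, and it has not been tried on the file linked above.

```python
def horizontal_text(page_dict: dict, tol: float = 1e-3) -> str:
    """Join the text of all lines written left to right.

    `page_dict` is assumed to have the shape PyMuPDF returns from
    page.get_text("dict"): blocks -> lines -> spans, where each line
    carries a "dir" tuple (cosine, sine) of its writing angle.
    """
    out = []
    for block in page_dict.get("blocks", []):
        for line in block.get("lines", []):  # image blocks have no "lines"
            cos, sin = line.get("dir", (1.0, 0.0))
            # Horizontal text has direction (1, 0); rotated text does not.
            if abs(cos - 1.0) <= tol and abs(sin) <= tol:
                out.append("".join(span["text"] for span in line["spans"]))
    return "\n".join(out)


def pdf_to_horizontal_text(path: str) -> str:
    """Extract only horizontal text from every page (requires PyMuPDF)."""
    import fitz  # PyMuPDF; imported here so the filter above works without it

    with fitz.open(path) as doc:
        return "\n".join(horizontal_text(page.get_text("dict")) for page in doc)
```

For a batch run, `pdf_to_horizontal_text` could be called in a loop over the files in a folder, writing each result to a `.txt` file for the text-to-audio step.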
I didn't say it can't be done with those languages, just that they can't be used in Acrobat, which is the subject of this forum.
You can do it in Acrobat (Pro) using the Redaction tool, which is located under Tools - Protection. Draw a redaction area around the text on the first page, then right-click it and select "Repeat mark across pages". Then apply the redactions and you're done.
This process can also be automated using JavaScript and incorporated into an Action, to process multiple files at once.
If you're interested I could write this code for you, for a small fee. You can contact me privately via try6767 at gmail.com to discuss it further.
Do the terms of use allow this?
If the sole purpose is to convert the files for text-to-audio, could you solve the problem by cropping the pages to remove the vertical text? I know that a piece of software like PitStop allows you to delete objects outside certain dimensions of a PDF page.
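If only the extracted text matters, a similar effect to cropping might be had by limiting extraction to the part of the page that excludes the strip on the right. The Python sketch below assumes PyMuPDF (imported as `fitz`) and its `clip` argument to `page.get_text`. The 50-point strip width is a made-up default that would need to be measured on the real pages, and both function names are hypothetical.

```python
def body_clip(width: float, height: float, right_strip_pt: float = 50.0):
    """Return (x0, y0, x1, y1) covering the page minus a strip on the right.

    `right_strip_pt` is the assumed width of the watermark strip in points;
    the default is a placeholder, not a measured value.
    """
    if not 0 <= right_strip_pt < width:
        raise ValueError("strip width must be >= 0 and smaller than the page width")
    return (0.0, 0.0, width - right_strip_pt, height)


def pdf_to_cropped_text(path: str, right_strip_pt: float = 50.0) -> str:
    """Extract text from each page, ignoring the right-hand strip (requires PyMuPDF)."""
    import fitz  # PyMuPDF; imported here so body_clip works without it

    parts = []
    with fitz.open(path) as doc:
        for page in doc:
            clip = fitz.Rect(*body_clip(page.rect.width, page.rect.height, right_strip_pt))
            parts.append(page.get_text("text", clip=clip))
    return "\n".join(parts)
```

This leaves the PDFs themselves untouched, which may be preferable to cropping because changing a page's crop box typically only hides content and does not remove it from the file.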