Simpler PDF text extraction?
Hi, I am trying to extract all the text from a PDF. I was able to use ExtractPDFJob, but the result is far more than I need: a large file with coordinates, bounds, etc., when all I need is the text itself and, ideally, the page number.
So something like:
[ {page: "0", text: "Now is the time to live to the fullest"}, {page: "1", etc} ]
Is anything like this possible? I could of course parse what is given to me now, but I am worried about the size of the file, as this processing runs in an AWS Lambda.
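For reference, here is a rough sketch of the "parse it myself" fallback, in Python. It assumes the extract result is a JSON document with a top-level `elements` array whose entries carry `Page` and `Text` keys; that is how the output looked to me, but treat those key names as assumptions. The function name `pages_from_extract` and the sample data are made up for illustration.

```python
from collections import defaultdict


def pages_from_extract(structured):
    """Collapse extract output into [{page, text}, ...], one entry per page.

    Assumes (not confirmed) that each item in structured["elements"]
    may have a "Page" number and a "Text" string.
    """
    by_page = defaultdict(list)
    for element in structured.get("elements", []):
        text = element.get("Text")
        page = element.get("Page")
        # Skip elements without text (e.g. figures) or without a page number.
        if text is None or page is None:
            continue
        by_page[page].append(text.strip())
    # Join the text fragments of each page, in page order.
    return [
        {"page": str(page), "text": " ".join(by_page[page])}
        for page in sorted(by_page)
    ]


# Hypothetical sample shaped like the output described above.
sample = {
    "elements": [
        {"Page": 0, "Text": "Now is the time ", "Bounds": [0, 0, 10, 10]},
        {"Page": 0, "Text": "to live to the fullest"},
        {"Page": 1, "Path": "//Document/Figure"},
        {"Page": 1, "Text": "Second page"},
    ]
}

pages = pages_from_extract(sample)
print(pages)
```

This still has to load the whole JSON into memory first (for example with `json.load`), which is exactly the part I am unsure about inside a Lambda.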
