I'm looking to convert a large number of documents to text. I need metadata about each document as well as the text. One data point I need is the indent: whether a line is indented from the left margin, and some quantification of that indent, such as spaces, tabs, or something else. If such a tool does not exist I am willing to code it, but this is new to me. What tools might help solve this problem?
Do you need this data before or after the document is converted? We can only help with analyzing a PDF, i.e. after conversion.
After conversion is acceptable.
If you want to do this by hand, Acrobat has rulers and a grid. Look under View -> Show/Hide -> Rulers & Grids.
There is also a Measure Tool; look for it in the "Tools" tab.
This can also be done automatically with a script, but it's quite a job.
There's no way to measure the distance of a word from the margin of a page in spaces or tabs, because those don't exist in a PDF file as such. You can measure the physical distance in points and then convert it to inches, centimeters, etc.
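As a rough illustration of that conversion, assuming the default PDF unit of 72 points per inch (the function names below are made up for this example):

```javascript
// Default PDF user space units are 1/72 inch ("points").
var POINTS_PER_INCH = 72;
var CM_PER_INCH = 2.54;

// Convert a distance in points to inches.
function pointsToInches(pt) {
  return pt / POINTS_PER_INCH;
}

// Convert a distance in points to centimeters.
function pointsToCentimeters(pt) {
  return pointsToInches(pt) * CM_PER_INCH;
}
```

So a line that starts 36 points to the right of the margin is indented by half an inch.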
What tools/technologies are useful for measuring the physical distance?
This can be done using Acrobat JavaScript, but it's quite a complex task.
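To give an idea of where such a script might start, here is a minimal sketch, not a finished tool. It assumes the Doc methods getPageNumWords, getPageNthWord and getPageNthWordQuads from the Acrobat JavaScript API. The helper names, the 2-point tolerance, and the idea of treating the smallest left edge on the page as the "margin" are all assumptions made for this example. It also ignores page rotation and multi-column layouts.

```javascript
// Left and bottom edge of one quad. A quad is 8 numbers
// (four x,y corner pairs) in points; taking the minimum of the
// x and y values avoids depending on the corner order.
function quadBox(quad) {
  return {
    left: Math.min(quad[0], quad[2], quad[4], quad[6]),
    bottom: Math.min(quad[1], quad[3], quad[5], quad[7])
  };
}

// Hypothetical helper: group the words of a page into lines by their
// bottom edge, then report how far each line starts from the
// leftmost line on the page. "doc" is the Doc object.
function measureLineIndents(doc, pageNum, tolerance) {
  var tol = (tolerance === undefined) ? 2 : tolerance; // assumed, in points
  var lines = [];
  var count = doc.getPageNumWords(pageNum);
  for (var i = 0; i < count; i++) {
    var quads = doc.getPageNthWordQuads(pageNum, i);
    if (!quads) continue;
    var word = doc.getPageNthWord(pageNum, i, true);
    // A hyphenated word can have one quad per line, so treat each quad separately.
    for (var q = 0; q < quads.length; q++) {
      var box = quadBox(quads[q]);
      var line = null;
      for (var j = 0; j < lines.length; j++) {
        if (Math.abs(lines[j].bottom - box.bottom) <= tol) {
          line = lines[j];
          break;
        }
      }
      if (line === null) {
        lines.push({ bottom: box.bottom, left: box.left, firstWord: word });
      } else if (box.left < line.left) {
        line.left = box.left;
        line.firstWord = word;
      }
    }
  }
  if (lines.length === 0) return [];
  // Use the leftmost line start as a stand-in for the left margin.
  var margin = Math.min.apply(null, lines.map(function (l) { return l.left; }));
  return lines.map(function (l) {
    return { firstWord: l.firstWord, leftPt: l.left, indentPt: l.left - margin };
  });
}
```

In the Acrobat JavaScript console this could be tried with measureLineIndents(this, 0) for the first page. The indentPt values are in points, so dividing by 72 gives inches. Real documents will probably need more work, for example measuring against the page's crop box instead of the leftmost line.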