Accurately Extract Data from US Tax Forms
I need to extract information from a PDF of a US Corporate Income Tax Return (Form 1120) into Excel.
Are there any features in the Adobe suite that would allow me to perform this conversion accurately? OCR does not appear to be very reliable given the complexity of these forms. Similarly, exporting to Excel has problems with merged columns and checkboxes.
The IRS publishes various fictitious sample Form 1120 returns. Here is a relatively standard return of low to moderate complexity: https://www.irs.gov/pub/irs-wi/ty24-f1120-ats-scenario-4.pdf As a test case, try extracting the table labeled Schedule L on page 9 of 64 of the PDF.
Thank you,
Pablo
