Hi Community! Hope you are doing awesome. Can anybody help me with this code? I am trying to extract the contents of specific text frames in two areas of a PDF. Those areas never change, so the info I need will always be there.
I know some people will suggest a Python scraper, or exporting the PDF to XML and looking for the coordinates. I know all that, but I would like to know if it is possible to do it directly. Here is the code:
var file = File.openDialog("Select a PDF file to extract text from");
if (file == null) {
    alert("No file selected");
    exit();
}
var pdfDoc = app.open(file);
alert(file);
var page = pdfDoc.pages[0];
var locationpdf1 = [115.2, 64.194, 238.36, 692.194];
var locationpdf2 = [415.6, 776.366, 529.582, 788.366];
var text1 = page.textFrames.add();
text1.visible = false;
text1.geometricBounds = locationpdf1;
var content1 = text1.contents;
var text2 = page.textFrames.add();
text2.visible = false;
text2.geometricBounds = locationpdf2;
var content2 = text2.contents;
var textFrame1 = app.activeDocument.textFrames.add();
textFrame1.contents = content1;
textFrame1.position = [100, 300]; // x, y
var textFrame2 = app.activeDocument.textFrames.add();
textFrame2.contents = content2;
textFrame2.position = [300, 300]; // x, y
alert(content1 + " " + content2)
I would appreciate your help. Let me know if you need more info. (No, I can't provide an example file because the whole PDF is confidential, but the principle is the same: get or copy data from specific areas in a PDF.)
It doesn't work like that. You can't add a text frame at a specific location and somehow get the contents of an existing text frame at that location.
One way of doing it is to get all the text frames and check their x, y locations one at a time. If one matches your expected location, you have found it, and you can get its contents.
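The loop described above could be sketched roughly as follows. This is only an illustration: `pointInRect` and `findFramesInRect` are hypothetical helper names, the rectangle is a placeholder, and it assumes the rectangle is given as [x1, y1, x2, y2] in the same coordinate space that `textFrame.position` (the top-left corner of the frame) reports.

```javascript
// Sketch: collect text frames whose position lies inside a known rectangle.
// Assumption: rect is [x1, y1, x2, y2] in the same coordinate space as
// textFrame.position. The corner order does not matter, it is normalized here.
function pointInRect(x, y, rect) {
    var minX = Math.min(rect[0], rect[2]);
    var maxX = Math.max(rect[0], rect[2]);
    var minY = Math.min(rect[1], rect[3]);
    var maxY = Math.max(rect[1], rect[3]);
    return x >= minX && x <= maxX && y >= minY && y <= maxY;
}

// Walk every text frame in the document and keep those inside rect.
function findFramesInRect(doc, rect) {
    var matches = [];
    for (var i = 0; i < doc.textFrames.length; i++) {
        var frame = doc.textFrames[i];
        if (pointInRect(frame.position[0], frame.position[1], rect)) {
            matches.push(frame);
        }
    }
    return matches;
}

// Only runs inside Illustrator, where the global `app` exists.
if (typeof app !== "undefined" && app.documents.length > 0) {
    // Placeholder rectangle; replace with the area you expect the text in.
    var found = findFramesInRect(app.activeDocument, [410, 770, 535, 795]);
    for (var j = 0; j < found.length; j++) {
        alert(found[j].contents);
    }
}
```

Testing against a rectangle rather than a single point leaves a little slack if the frame is not exactly where you expect it.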
I was hoping to avoid having Illustrator open the PDF file, but I think it is totally necessary. Let me try it that way (it's going to take a while, though).
Yes, if you have a lot of text items it might take a while.
Another option: create a temporary artboard at the location where you expect your text to be, then select everything on that artboard. At that point only one item should be selected, provided nothing overlaps. That might be quicker.
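A rough sketch of the temporary-artboard idea follows. `textInRegion` is a hypothetical helper name and the rectangle is a placeholder. It assumes the artboard rectangle is [left, top, right, bottom], that a newly added artboard ends up last in the list, and that `selectObjectsOnActiveArtboard()` picks up whatever Illustrator considers to be on that artboard (locked or hidden items would likely be skipped).

```javascript
// Sketch: read the text found on a temporary artboard drawn over a region.
// Assumptions: rect is [left, top, right, bottom]; the new artboard is
// appended as the last one; locked or hidden items are not selected.
function textInRegion(doc, rect) {
    var previousIndex = doc.artboards.getActiveArtboardIndex();
    var temp = doc.artboards.add(rect);
    var results = [];
    try {
        doc.artboards.setActiveArtboardIndex(doc.artboards.length - 1);
        doc.selection = null;                 // start from an empty selection
        doc.selectObjectsOnActiveArtboard();
        var selected = doc.selection || [];
        for (var i = 0; i < selected.length; i++) {
            if (selected[i].typename === "TextFrame") {
                results.push(selected[i].contents);
            }
        }
    } finally {
        // Clean up even if something above failed.
        doc.selection = null;
        temp.remove();
        doc.artboards.setActiveArtboardIndex(previousIndex);
    }
    return results;
}

// Only runs inside Illustrator, where the global `app` exists.
if (typeof app !== "undefined" && app.documents.length > 0) {
    // Placeholder rectangle; replace with the area you expect the text in.
    alert(textInRegion(app.activeDocument, [410, 795, 535, 770]).join("\n"));
}
```

If more than one string comes back, something else overlaps the region and the rectangle probably needs tightening.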
Working on something right now, I'll keep you posted.
By the way, perhaps you could open your PDF in Illustrator and remove all the sensitive info (e.g. deleting items and replacing text with dummy text), but be sure to leave the text frames that you are interested in (just change the text). That way you should be able to post a sample file. More people will want to help if they have a concrete example.
- Mark
Hi good afternoon! I was working on something already but still testing a few things, once I have something good ready I will post it, thank you!
Also, here's a tip: when comparing numbers, rather than testing (a === b), test (Math.abs(a - b) < tolerance) instead. This is because coordinates in Illustrator are sometimes only really meaningful to 3 decimal places, and sometimes even fewer depending on the process that generated them, but an equality comparison will fail even if the difference is less than 0.001.
- Mark
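A minimal sketch of that comparison is below. `nearlyEqual` is a hypothetical helper name, and the default tolerance of 0.001 is just an assumption taken from the figure mentioned in the tip.

```javascript
// Compare two numbers within a tolerance instead of using ===.
// The default of 0.001 is an assumption; pick whatever suits your files.
function nearlyEqual(a, b, tolerance) {
    if (tolerance === undefined) { tolerance = 0.001; }
    return Math.abs(a - b) < tolerance;
}

var stored = 415.599609375;  // a coordinate as a script might report it
var typed = 415.6;           // the same coordinate as typed by hand

var strictMatch = (stored === typed);            // false
var tolerantMatch = nearlyEqual(stored, typed);  // true, difference is about 0.0004
```

A larger tolerance (0.5 point, say) may be more forgiving when the coordinates come from different PDFs produced by the same template.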
Oh wow, I did not know about this! Thank you so much! I am sure I will have something to show you all today.
Hi community! I got this:
var doc = app.activeDocument;
var contenido = "";
var x1 = Math.abs(489.5498046875);
var y1 = Math.abs(789.15380859375);
var x2 = Math.abs(415.599609375);
var y2 = Math.abs(789.16845703125);
for (var i = 0; i < doc.textFrames.length; i++) {
    var textFrame = doc.textFrames[i];
    var contents = textFrame.contents;
    var x = Math.abs(textFrame.position[0]);
    var y = Math.abs(textFrame.position[1]);
    // Check if the current text frame matches the specified coordinates
    if (x === x1 && y === y1 || x === x2 && y === y2) {
        contenido += contents; // Concatenate the contents of matching text frames
    }
}
But it is not concatenating the two numbers I need. Can I ask for some help, please?
Sample file attached.
(previous version was like:
var doc = app.activeDocument;
var contenido = "";
for (var i = 0; i < doc.textFrames.length; i++) {
    var textFrame = doc.textFrames[i];
    var contents = textFrame.contents;
    var x = Math.abs(textFrame.position[0]);
    var y = Math.abs(textFrame.position[1]);
    alert("Contents: " + contents + "\nX: " + x + "\nY: " + y);
    contenido = contents;
}
)
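One possible reason the script above finds nothing is the strict `===` test on long floating-point coordinates. The script also never displays `contenido`. A hedged sketch that applies a tolerance instead is shown here: `nearlyEqual` and `collectContents` are hypothetical names, the 0.5 point tolerance is an assumption, and the use of `Math.abs` simply mirrors the original script.

```javascript
function nearlyEqual(a, b, tolerance) {
    return Math.abs(a - b) < tolerance;
}

// targets is an array of [x, y] pairs. Returns the contents of the matching
// frames joined in the order the targets are listed, so the result does not
// depend on the stacking order of the frames in the document.
function collectContents(doc, targets, tolerance) {
    var parts = [];
    for (var t = 0; t < targets.length; t++) {
        for (var i = 0; i < doc.textFrames.length; i++) {
            var pos = doc.textFrames[i].position;
            // Math.abs mirrors the original script: the sign is ignored.
            if (nearlyEqual(Math.abs(pos[0]), Math.abs(targets[t][0]), tolerance) &&
                nearlyEqual(Math.abs(pos[1]), Math.abs(targets[t][1]), tolerance)) {
                parts.push(doc.textFrames[i].contents);
            }
        }
    }
    return parts.join("");
}

// Only runs inside Illustrator, where the global `app` exists.
if (typeof app !== "undefined" && app.documents.length > 0) {
    var targets = [[489.55, 789.15], [415.6, 789.17]];
    alert(collectContents(app.activeDocument, targets, 0.5));
}
```

Listing the targets in the order you want the pieces joined keeps the output stable from file to file.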
Artboard is set like this: