I receive text documents from Word which are a mess. I need to clean them before they go into the next stage of my workflow. I have a great script which cleans all of the text and it works fine, except for one thing – one of the characters in the copy is a square bullet. This square bullet needs to have 2 spaces preceding it in the final text. This snippet finds the square bullet and replaces it with a unique symbol perfectly.
First Script
var search_string = / ▪/gi; // g for global search, remove i to make a case sensitive search
var replace_string = " Ω";
var text_frames = active_doc.textFrames;
if (text_frames.length > 0)
{
for (var i = 0 ; i < text_frames.length; i++)
{
var this_text_frame = text_frames[i];
var new_string = this_text_frame.contents.replace(search_string, replace_string);
if (new_string != this_text_frame.contents)
{
this_text_frame.contents = new_string;
}
}
}
The reason I chose to replace it with the Ω symbol is because it's likely not ever going to be found in the variety of word documents which I will be receiving.
The problem is that the square bullet in the final product needs to be produced in a specific font, Zapf Dingbats. The keystroke which is required to produce the square bullet in that font is a lowercase ‘n’. I thought I could find the Ω and replace it with a lowercase ‘n’ following another script I use which finds a specific character and applies a character style to it. Here is that script:
Second Script
var main = function() {
var doc;
if ( app.documents.length > 0 && app.activeDocument.textFrames.length > 0 ) {
// Set the value of the character to look for
searchCharacter1 = "Ω";
// Iterate through all characters in the document
// and color the character that match searchWord
for ( i = 0; i < app.activeDocument.textFrames.length; i++ ) {
textArt = activeDocument.textFrames[i];
for ( j = 0; j < textArt.characters.length; j++) {
character = textArt.characters[j];
if ( character.contents == searchCharacter1 ) {
// character.filled = true;
// character.fillColor = CharacterColor;
app.activeDocument.characterStyles.getByName ( "Square Bullet" ).applyTo ( character, true );
}
}
}
}
}
var u;
main();
var active_doc = app.activeDocument;
var search_string = /Ω/gi; // g for global search, remove i to make a case sensitive search
var replace_string = "n";
var text_frames = active_doc.textFrames;
if (text_frames.length > 0)
{
for (var i = 0 ; i < text_frames.length; i++)
{
var this_text_frame = text_frames[i];
var new_string = this_text_frame.contents.replace(search_string, replace_string);
if (new_string != this_text_frame.contents)
{
this_text_frame.contents = new_string;
}
}
}
The problem is, when I run the second script it successfully changes the character but it loses the character style.
I tried searchWord for " n", but JavaScript doesn't like the spaces. Yes, I tried "\s\sn" and I tried "\u0020\u0020n". I can't replace with just an ‘n’ because then it would find every ‘n’ in the copy and replace them with square bullets.
All I want to do is clean the original text so I get "space space square bullet", then replace the square bullet with ‘n‘, then format just the ‘n’ with a character style named Square Bullet.
All of these scripts, which I have gratefully received from members of this forum, are set to execute throughout an entire document. Is there a way to restrict the execution of the scripts to a specific layer ONLY?
One last thing… I have read through the Illustrator Scripting Guide. I have skimmed through the Illustrator Scripting Reference - JavaScript and I have viewed the first four units of the JavaScript Essential Training course from Lynda.com. I'm kinda getting this but not fast enough. Any assistance would be helpful.
Marcrest
That is a bunch to follow. Can you post some screenshots as a visual aid? To restrict to one layer, you have to have your layer designated, so something like doc.layers[0] or doc.layers.getByName("My Layer"). Each of these has a collection of top-level art objects, including textFrames. Use doc.layers[0].textFrames to get the top-level text frames on the first layer. To get nested text frames which are inside of groups, you'd have to have some functions to get into the groups and go through their textFrames and groups, and so on.
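The "functions to get into the groups" idea could look roughly like the sketch below. It assumes, as described above, that a container's textFrames collection only lists top-level frames, and it walks the groupItems and sublayer (layers) collections recursively. The function name collectTextFrames and the mock layer object are hypothetical; the mock is there only so the sketch can run outside Illustrator.

```javascript
// Recursively gather text frames from a layer-like or group-like container.
// textFrames, groupItems and layers are the collection names used by
// Illustrator's object model; missing collections are treated as empty.
function collectTextFrames(container, found) {
    found = found || [];
    var i;
    var frames = container.textFrames || [];
    for (i = 0; i < frames.length; i++) {
        found.push(frames[i]);
    }
    var groups = container.groupItems || [];
    for (i = 0; i < groups.length; i++) {
        collectTextFrames(groups[i], found); // descend into each group
    }
    var sublayers = container.layers || [];
    for (i = 0; i < sublayers.length; i++) {
        collectTextFrames(sublayers[i], found); // descend into each sublayer
    }
    return found;
}

// Hypothetical stand-in for doc.layers.getByName("Chart")
var mockChartLayer = {
    textFrames: [{ contents: "English" }],
    groupItems: [
        { textFrames: [{ contents: "French" }], groupItems: [] }
    ],
    layers: []
};
var chartFrames = collectTextFrames(mockChartLayer);
```

Inside Illustrator the call would presumably be collectTextFrames(doc.layers.getByName("Chart")). If textFrames turns out to include nested frames already, the recursion would produce duplicates and should be dropped.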
Sorry for the lengthy post. Your comments regarding the layers are great. Message received. Fortunately I have constructed my files with this type of functionality in mind. The element which I am creating works perfectly within these parameters. I have a layer which the element resides on… "Chart". Although there are multiple elements on that layer, there are only 2 which are text frames (although each text frame may have additional linked text frames associated with it). Now I just need to figure out how to add them to the code I already have : )
Here's a screen capture. Hope it helps understand my issues.
Thanks Silly-V.
Hi, your script replaces the whole contents of the textFrame, so the style of all the text becomes the same as that of the first character.
And there's no need to take 2 steps; changing contents and applying the style can be done in one go. Try this:
var search_string = / ▪/gi; // g for global search, remove i to make a case sensitive search
var replace_string = " n";
var active_doc = app.activeDocument;
var text_frames = active_doc.textFrames;
if (text_frames.length > 0) {
for (var i = 0; i < text_frames.length; i++) {
var this_text_frame = text_frames[i];
var match = search_string.exec('');
while (match = search_string.exec(this_text_frame.contents)) {
this_text_frame.characters[match.index + 1].contents = replace_string; // replace "▪" with " n"
active_doc.characterStyles.getByName("Square Bullet")
.applyTo(this_text_frame.characters[match.index + 2], true); // only apply to "n", leave space
}
}
}
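For readers unsure what the two exec calls in the loop above are doing, here is a minimal sketch in plain JavaScript with no Illustrator objects. With the g flag, each exec call resumes from the regex's lastIndex, so a while loop visits every match in turn. Calling exec on an empty string first returns null and resets lastIndex to 0, which guarantees the loop starts at the beginning of the text. The sample string is made up for illustration.

```javascript
var search_string = / ▪/g;           // one space followed by the square bullet
var text = "One ▪ two ▪ three";      // hypothetical sample text

// No match is possible in '', so this returns null and resets lastIndex to 0.
search_string.exec('');

var positions = [];
var match;
while ((match = search_string.exec(text)) !== null) {
    // match.index is where the match starts (the space);
    // +1 is therefore the index of the bullet itself.
    positions.push(match.index + 1);
}
// After the final failed exec, lastIndex is back at 0 again.
```

The indices collected in positions are the same numbers the Illustrator script passes to characters[...], which is why individual characters can be restyled without rewriting the whole contents.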
moluapple this is great. Thank you. The problem is that for my current process I must clean the incoming text first and then apply the character style. I can't combine them. Let me explain – I receive a copy deck from my client. I clean the text in one document and then format it in another. I do this in two steps because executing it all in one script, although possible, is simply WAY beyond my capabilities. I have twelve unique paragraph styles and fourteen character styles which need to be applied to two languages in two different formats, and while it would be truly incredible to create a script which handled everything, I'm just not sure, being the novice that I am, that I can create the script in the time frame which I have available (a week or so).
Could I ask you to split your recent code into 2 pieces, one which searches and replaces and the other which applies the format? I tried splitting it into 2 pieces without success. I love the way you restricted the text to one text frame, but would prefer it run on a single layer, "Chart", instead. As I mentioned in my last post, my Chart layer will only have 2 text boxes on it (one English and one French). There may be more than 2 in the sense that there may be linked "children" boxes for each of the 2 parent boxes. "Gees, I'm even starting to sound like I know this stuff – parent and children… ha".
Qwertyfly... and Silly-V are listening – I finished my Lynda.com training. I created my own script yesterday, which did what I wanted it to do – YAY! AND I understood what each line did! – I am learning!
To all who have helped me so far, I am truly thankful. I feel bad picking your brains and copying your code… Are any of you interested in assisting with the development of this "bigger" script? My understanding of JavaScript so far tells me that one script is possible. And although I am making progress with my own learning, I do recognize that this is beyond what I can do in my available timeframe. Financial consideration is not out of the question.
One question: since there may be linked text frames, one text frame can end with a space and the next can start with ▪, right? If that can happen, it's safer to deal with "Story" instead of "TextFrame".
So 1:
var search_string = / ▪/gi; // g for global search, remove i to make a case sensitive search
var replace_string = "  n"; // two spaces + "n" works directly, so no need for Ω
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) { // "Chart" layer only
var new_string = this_story.textRange.contents.replace(search_string, replace_string);
if (new_string != this_story.textRange.contents) {
this_story.textRange.contents = new_string;
}
}
}
2:
var search_string = /  n/gi; // two spaces then "n"; g for global search, remove i to make a case sensitive search
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
var match = search_string.exec('');
while (match = search_string.exec(this_story.textRange.contents)) {
active_doc.characterStyles.getByName("Square Bullet")
.applyTo(this_story.textRange.characters[match.index + 2], true); // only apply to "n"
}
}
}
@moluapple thank you again, this has been really helpful. I have adjusted my current script to use stories instead. Your point about text boxes splitting the text made sense. I am, however, having a problem with changing my other searches from text frames to stories. See below:
snippet A:
var search_string = /   \r/gi; // three spaces + return; g for global search, remove i to make a case sensitive search
var replace_string = "  \r"; // two spaces + return
var text_frames = active_doc.textFrames;
if (text_frames.length > 0)
{
for (var i = 0 ; i < text_frames.length; i++)
{
var this_text_frame = text_frames[i];
var new_string = this_text_frame.contents.replace(search_string, replace_string);
if (new_string != this_text_frame.contents)
{
this_text_frame.contents = new_string;
}
}
}
This code works but it isn't limited to my "chart" layer.
I tried to convert your snippet, but for some reason it's not working.
snippet B:
// space space space hard return gets replaced with space space hard return
var search_string = /   \r/gi; // g for global search, remove i to make a case sensitive search
var replace_string = "  \r";
var active_doc = app.activeDocument;
var text_frames = active_doc.textFrames;
var chart_layer = active_doc.layers.getByName('Chart');
if (text_frames.length > 0) {
for (var i = 0; i < text_frames.length; i++) {
var this_text_frame = text_frames[i];
if (this_text_frame.textFrames[0].parent == Chart_layer) { // "Chart" layer only
var new_string = this_text_frame.textRange.contents.replace(search_string, replace_string);
if (new_string != this_text_frame.textRange.contents) {
this_text_frame.textRange.contents = new_string;
}
}
}
}
It says undefined is not an object. I figured it was the \r characters which were throwing off the "stories" so I switched them out for "text_frames". What am I missing?
Everything else worked perfectly in the Search and Replace… Now I just have to check the Search and Apply script : ) Thanks again everyone for your help with this. I think I can get it done : )
AAARRGG!
This logic is escaping me. I thought I could copy and repeat the bullet script for other text strings, like this one:
snippet C: searching the word "Title", applying a different character style to it, "Title". But instead of having it change only the bullet, or the character in the third position (2), I thought I should replace it with nothing (or all?), then I thought it should be 0, then I thought it should count all the letters of the word Title (4). Wrong again honey!
// Title – this doesn't work
var search_string = /Title/g; // g for global search, remove i to make a case sensitive search
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
var match = search_string.exec('');
while (match = search_string.exec(this_story.textRange.contents)) {
active_doc.characterStyles.getByName("Title")
.applyTo(this_story.textRange.characters[4], true); // only apply to "Title"
}
}
}
snippet D: Again, I thought I could amend the bullet script to apply a paragraph style but I ran into a road block because I don't know how to define "all the text" in the text box? I guess the *.* thing didn't work. So much for my old DOS training. Yes I also trained on punch cards!
// All text into paragraph style "body copy" – This doesn't work
var search_string = /\*.\*/g; // g for global search, remove i to make a case sensitive search
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
var match = search_string.exec('');
while (match = search_string.exec(this_story.textRange.contents)) {
active_doc.paragraphStyles.getByName("body copy")
.applyTo(this_story.textRange.paragraphs[0], true); // apply to "all copy" in text_frame
}
}
}
By the way, can you explain the search_string.exec(''); line? It's really the .exec part that I don't understand.
Seriously we are so close.
Also… Is there a way to search for a variable number of characters? For example, within the client copy deck there are 2 lines which go like this:
g................................P, but the number of dots changes. Sometimes it's 4, sometimes it's 10. Obviously this is someone who doesn't know how to use tabs. I need to replace this variable string with space tab space or \s\t\s. Any thoughts?
If I can solve that I should be able to solve the space space space space space space issue where the number of spaces change. In that case I have to change to the same thing \s\t\s.
1.
g................................P, but the number of dots changes. Sometimes it's 4, sometimes it's 10.
Well, please try googling "regular expression", or go to this link: Regular Expressions Quick Start.
The regex here can be: /\.{2,}/g
2. You mean this didn't work? But I tested and it works.
var search_string = /   \r/gi; // three spaces + return; g for global search, remove i to make a case sensitive search
var replace_string = "  \r"; // two spaces + return
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) { // "Chart" layer only
var new_string = this_story.textRange.contents.replace(search_string, replace_string);
if (new_string != this_story.textRange.contents) {
this_story.textRange.contents = new_string;
}
}
}
3. Apply "Title" style
var search_string = /Title/g;
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
var match = search_string.exec('');
while (match = search_string.exec(this_story.textRange.contents)) {
var match_chars = this_story.textRange.characters[match.index]; // match_chars = "T"
match_chars.length = match[0].length; // now match_chars = "Title"
active_doc.characterStyles.getByName("Title").applyTo(match_chars, true);
}
}
}
4. All text into paragraph style "body copy", so there is no need to search, just apply it to the TextRange
var active_doc = app.activeDocument;
var chart_txts = active_doc.layers.getByName('Chart').textFrames; // texts on Chart layer only
for (var i = 0; i < chart_txts.length; i++) {
active_doc.paragraphStyles.getByName("body copy")
.applyTo(chart_txts[i].textRange, true);
}
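As a sketch of how the /\.{2,}/g pattern from point 1 could be used for the dot-leader cleanup, the plain JavaScript below replaces any run of two or more periods with space, tab, space. Note that "\s" has no meaning inside a replacement string, so the replacement is written as " \t ". The function name normalizeLeaders is hypothetical, and the second pattern for runs of spaces is an assumption: its threshold of three is chosen only so that the two spaces in front of a square bullet would survive.

```javascript
var dot_leader = /\.{2,}/g;   // two or more consecutive periods
var space_run = / {3,}/g;     // three or more consecutive spaces (assumed threshold)

// Collapse dot leaders and long space runs into space + tab + space.
function normalizeLeaders(line) {
    return line.replace(dot_leader, " \t ").replace(space_run, " \t ");
}

var four_dots = normalizeLeaders("g....P");
var ten_dots = normalizeLeaders("g..........P");
var six_spaces = normalizeLeaders("g      P");
var decimal = normalizeLeaders("3.5 g");      // a single period is left alone
```

In the cleaning script the same two replace calls could be applied to this_story.textRange.contents, in the same way as the other search-and-replace steps.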
moluapple these scripts are working out almost perfectly. Thank you again. In response to the above… #1 worked great. #2 I didn't have ExtendScript Toolkit set to Illustrator and that's why it wasn't working : ) #3 worked perfectly. I now have a cleaning script which cleans my incoming copy perfectly. Unfortunately my formatting script is still a little wobbly. Here are my problems:
#1 Within my chart I have sections, which are divided by lines [created by tabs (\t\r)]. Easy to search for and replace with a paragraph style, but the problem is each section has sub-sections within which are divided by more lines. I thought I could, in my cleaning script, identify these inner lines as (\t\t\r) which would make them unique. Search (\t\t\r) and apply a different paragraph style and voila. But the problem is my script is getting confused. It is only applying 1 paragraph style, the first one. Thoughts? This problem manifests itself again at another spot where I have two lines with \s\t\s. In the first instance it is to apply a paragraph style to a tab which has NO fill character and in the second instance it is to apply a second paragraph style which DOES have a fill character.
#2 The previously provided scripts for the bullet searches aren't working with the larger more complete scripts. It's because those were developed prior to the Chart layer exclusivity. I know it's because they are searching for strings and now that the cleaning has been done, I just need it to search for characters. See below
// Bullets – this doesn't work
var search_character1 = "•";
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
var match = search_string.exec('');
while (match = search_string.exec(this_text_frame.contents)) {
this_text_frame.characters.contents = searchCharacter1
active_doc.characterStyles.getByName("Bullet")
.applyTo(characters, true);
}
}
}
#3 I need to search through the document for a specific word. Easy. The problem is there are multiple locations for that single word (6 in total). I need to apply a character style to the first instance and the fifth instance. This happens with a string as well, but it only finds the string twice.
#4 Is there a way to search for a string and then apply a character style to only the first part of the string (example: "wrench\s\t\sprice" or "screwdriver\s\t\ssize")? Say I wanted to apply a character style to the words "wrench" and "screwdriver" but not to everything that follows. Can I search the string "wrench\s\t\s" and then apply the character style to only what precedes the "\s\t\s"?
I have been reviewing the regular Expressions Quick Start and another from Mozilla. Very cool that you can do all this stuff. A friend of mine has been doing web stuff for years and I never thought it applied to Illustrator. Cool. More questions tomorrow I'm sure, but I'm almost done. With these questions, I'll be 90%, maybe more. Thanks to all.
#1. Sorry, I need a real world example to understand the problem.
#2. You can just modify the Apply "Title" style code snippet:
- replace /Title/g with /•/g
- replace characterStyles.getByName("Title") with characterStyles.getByName("Bullet")
then it should work.
#3. If the occurrences of the specific word are not all in the same story, it's hard to tell which one is first and which one is last. So the code below may make mistakes, but the concept is there.
var search_string = /specific_word/g;
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var j = 0;
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
var match = search_string.exec('');
while (match = search_string.exec(this_story.textRange.contents)) {
j++;
if (j == 1 || j == 5) {
var match_chars = this_story.textRange.characters[match.index];
match_chars.length = match[0].length;
active_doc.characterStyles.getByName("specific word").applyTo(match_chars, true);
}
}
}
#4. The regex for you: /wrench(?=\s\t\s)/ , see lookaround part of the quick start.
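To illustrate why the lookahead in #4 helps, here is a small sketch in plain JavaScript. The (?=\s\t\s) part must be present for a match to succeed, but it is not included in match[0], so match.index and match[0].length describe only the word itself. The sample text and the alternation (wrench|screwdriver) are made up for illustration.

```javascript
var search_string = /(wrench|screwdriver)(?=\s\t\s)/g;
// Hypothetical sample: two lines with space-tab-space, one without
var text = "wrench \t price\rscrewdriver \t size\rwrench set";

var hits = [];
var match;
while ((match = search_string.exec(text)) !== null) {
    hits.push({
        word: match[0],            // only the word, not the space-tab-space
        index: match.index,        // where the word starts
        length: match[0].length    // how many characters to style
    });
}
```

The index and length values are the two numbers the "Title" snippet feeds to characters[match.index] and match_chars.length, so the same loop should style just the word. The last "wrench" is skipped because it is not followed by space, tab, space.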
Regarding #1. Sorry, it is a rather complex idea. A good real world example would be a nutrition chart, where there are thick lines represented by a \t and thin lines represented by a \t\t, and then there are the right-aligned tabs represented by \s\t\s between the g weight and the % value. I'm not sure if you can understand this visual, but it's a try. Currently when I search \t and/or any string which I thought would make each different tab unique, it applies one paragraph style to ALL of them. Here's an example of my "cleaned" script showing the \t's as », in purple. Sorry, I had to blur out the rest of the text for confidentiality.
Hope this helps.
OK, that makes sense.
- Thick rule: /^\t\r/
- Thin rule: /^\t\t\r/
- Well, it's a bit more complex for the paragraphs which need a different style applied to each of them. You should explain the difference between them more clearly, and the rule for how to apply a different style to each (e.g. one for the first paragraph and another for the rest, etc.).
/^[^\t]*\t[^\t]*\r/ will match all of them.
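One caveat with the patterns above: without the m flag, ^ only anchors at the very start of the whole string, so running them against an entire story would match at most the first paragraph. A minimal sketch of one alternative is to test each paragraph on its own, assuming paragraphs are separated by \r as elsewhere in this thread. The function name classifyParagraphs and the sample text are hypothetical.

```javascript
var thick_rule = /^\t$/;     // paragraph consisting of exactly one tab
var thin_rule = /^\t\t$/;    // paragraph consisting of exactly two tabs

// Label each paragraph of a story so a matching style can be chosen later.
function classifyParagraphs(story_text) {
    var paragraphs = story_text.split("\r");
    var labels = [];
    for (var i = 0; i < paragraphs.length; i++) {
        if (thick_rule.test(paragraphs[i])) {
            labels.push("thick");
        } else if (thin_rule.test(paragraphs[i])) {
            labels.push("thin");
        } else {
            labels.push("text");
        }
    }
    return labels;
}

var labels = classifyParagraphs("Nutrition Facts\r\t\rFat 5 g \t 8 %\r\t\t\rSodium");
```

Because the patterns carry no g flag, test() keeps no state between calls, so the order in which paragraphs are checked does not matter.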
moluapple we're in the final stretch. I have made terrific progress with so much, but these last four issues remain.
#1 Using the regex quick start you provided a link to, I was able to create the following, i.e. Do not use(?= \•with any other fuel source additives). This is great as I needed to apply a bold style to "Do not use", but that statement appears elsewhere in the copy. Here's the thing – the regex quick start says that this should work for a backwards lookaround, i.e. (?<=Do not use \•with any other fuel source )additives should bold the "additives", but it doesn't find the reference. Again, the word additives appears elsewhere in the copy. What am I doing wrong?
#2 These pesky "\t" are still not working…
// Thick rules
var search_string = /^\t\r/g;
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
var match = search_string.exec('');
while (match = search_string.exec(this_story.textRange.contents)) {
var match_chars = this_story.textRange.characters[match.index]; // match_chars = "T"
match_chars.length = match[0].length; // now match_chars = "Title"
active_doc.paragraphStyles.getByName("Thick rule_FLX").applyTo(match_chars, true);
}
}
}
What am I missing?
#3 The square bullets which we figured out previously were never restricted to the Chart layer. This is what I have. See my code comments (//)
// Square bullets – this includes the restriction to the chart layer but doesn't apply the character style to the 3rd character of the match
var search_string = / n)/g;
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
var match = search_string.exec('');
while (match = search_string.exec(this_story.textRange.contents)) {
var match_chars = this_story.textRange.characters[match.index]; // match_chars = "T"
match_chars.length = match[0].length; // now match_chars = "Title"
active_doc.characterStyles.getByName("Square Bullet").applyTo(match_chars, true);
}
}
}
// Square bullets - this does everything except restrict the search to the chart layer
var search_string = / n/gi; // g for global search, remove i to make a case sensitive search
var active_doc = app.activeDocument;
var text_frames = active_doc.textFrames;
if (text_frames.length > 0) {
for (var i = 0; i < text_frames.length; i++) {
var this_text_frame = text_frames[i];
var match = search_string.exec('');
while (match = search_string.exec(this_text_frame.contents)) {
this_text_frame.characters[match.index + 1].contents = replace_string; // replace "▪" with " n"
active_doc.characterStyles.getByName("Square bullet")
.applyTo(this_text_frame.characters[match.index + 2], true); // only apply to "n", leave space
}
}
}
I obviously am having trouble with the "while" statement and how it works?
#4 Related to the "\t" issue I have a commonly formatted string which has different information but appears over many lines within the copy deck. This common string references a product, then a weight reference, then the space tab space (filled with character) and then a price. Like this…
Fuel additive 500 mg..................................$12.00
Oil supplement 300 mg................................$3.00
etc.
I tried this regex line. "var search_string = / ˆ.+ mg \t .+$/g;" I thought it would select everything before the "mg" and after the "\t" and apply the paragraph style to it, on the line which it finds that information on, but it is applying the style to the entire text frame? Is there a way to limit it?
I am so happy for the help you have provided so far. Down to the wire now.
Cheers, Marcrest
#1 That's because the JavaScript engine in ExtendScript doesn't support regex lookbehind.
In this case you can search /(Do not use •with any other fuel source )(additives)/g, and only format match group 2. There is no need to add "\" before "•".
var search_string = /(Do not use •with any other fuel source )(additives)/g;
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
var match = search_string.exec('');
while (match = search_string.exec(this_story.textRange.contents)) {
var match_chars = this_story.textRange.characters[match.index + match[1].length];
match_chars.length = match[2].length;
active_doc.characterStyles.getByName("Bold").applyTo(match_chars, true);
}
}
}
#2 Well, applying a paragraphStyle is different from applying a characterStyle, see below.
var search_string = /^\t$/; // please replace \r with $, as the contents of a paragraph do not contain \r
var active_doc = app.activeDocument;
var stories = active_doc.stories;
var chart_layer = active_doc.layers.getByName('Chart');
for (var i = 0; i < stories.length; i++) {
var this_story = stories[i];
if (this_story.textFrames[0].parent == chart_layer) {
for (var j = 0; j < this_story.paragraphs.length; j++) {
var match_para = this_story.paragraphs[j];
if (search_string.test(match_para.contents)) {
active_doc.paragraphStyles.getByName("Thick rule_FLX").applyTo(match_para, true);
}
}
}
}
#3 ")" should be escaped in the regex.
var search_string = / n\)/g;
#4 See #2.
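The capture-group workaround from #1 can be checked in plain JavaScript before running it in Illustrator. The sketch below uses a made-up sample string in which "additives" appears twice. Only the occurrence preceded by the full phrase is matched, and its start is computed as match.index plus the length of group 1, which is the same arithmetic the Illustrator snippet uses.

```javascript
var search_string = /(Do not use •with any other fuel source )(additives)/g;
// Hypothetical sample text: "additives" appears twice
var text = "additives vary. Do not use •with any other fuel source additives.";

var ranges = [];
var match;
while ((match = search_string.exec(text)) !== null) {
    // match.index points at "D" of "Do not use"; skipping group 1
    // lands on the first character of group 2.
    var start = match.index + match[1].length;
    ranges.push({ start: start, length: match[2].length });
}
```

Each entry in ranges corresponds to one characters[start] range with its length set to match[2].length, so the leading "additives" at the start of the sample is never touched.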
Qwertyfly...Silly-V @moluapple I know I haven't posted in a while and I wanted to provide an update and say "THANK YOU" to all who assisted me with the development of my original script. Originally, I had hoped to create this myself, but I had to give it up to an internal programmer within my office to make it more practical and efficient. He and I have worked together and completed our original v.1. He knew nothing about Illustrator's scripting abilities and I (we) have opened his eyes to a whole new world. Our script is now running on two teams' machines (4 people) and we are enjoying a 98% time savings on the more tedious part of our task. That represents more than 1000 person hours with respect to the entire project. Your insights and support helped make this happen. Can't say enough about this community - YOU FOLKS ROCK!