I'm not much of a gun (at all) at regular expressions.
I need to replace everything between two comments with nothing.
I am trying to resurrect and clean up some HTML for someone whose site is only now available via the Wayback Machine.
So, I have lots (hundreds) of files with:
<!-- BEGIN WAYBACK TOOLBAR INSERT -->
I want to replace all of this HTML and, of course, the comments themselves
<!-- END WAYBACK TOOLBAR INSERT -->
I can do search & replaces on most other stuff but each of the comment sections is slightly different.
Is this possible? Thanks!
I'm using DW CS6 on a Mac.
If all of the comments are identical to what you put above, this should do the trick...
1. With one of your pages open, open the Find & Replace tool
2. Set the Find In dropdown to Current Document
3. Put this in the Find field...
<!-- BEGIN WAYBACK TOOLBAR INSERT -->
(.)*?
<!-- END WAYBACK TOOLBAR INSERT -->
4. Verify that the Replace field is blank
5. Make sure the Use Regular Expression box is checked
6. Hit Replace All
If it tests correctly on your open document, change the Find In dropdown to Entire Current Local Site and hit Replace All again.
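One likely reason a pattern like this finds nothing: in most regex engines, `.` does not match a line break, so `(.)*?` cannot cross the many lines inside the toolbar block. That Dreamweaver's Find & Replace behaves this way is an assumption here, not something confirmed in this thread. A minimal Python sketch of the behavior, using a made-up sample rather than the real markup:

```python
import re

# Hypothetical stand-in for a page with a Wayback toolbar block.
sample = (
    "<p>before</p>\n"
    "<!-- BEGIN WAYBACK TOOLBAR INSERT -->\n"
    "<script src='a.js'></script>\n"
    "<div>toolbar</div>\n"
    "<!-- END WAYBACK TOOLBAR INSERT -->\n"
    "<p>after</p>\n"
)

BEGIN = re.escape("<!-- BEGIN WAYBACK TOOLBAR INSERT -->")
END = re.escape("<!-- END WAYBACK TOOLBAR INSERT -->")

# The pattern from the steps above: '.' stops at every line break,
# so it can only bridge a single line between the two comments.
dot_pattern = re.compile(BEGIN + r"\n(.)*?\n" + END)

# [\s\S] means "whitespace or non-whitespace", i.e. any character at all,
# line breaks included. The trailing ? keeps the match as short as possible.
any_pattern = re.compile(BEGIN + r"[\s\S]*?" + END)

dot_found = dot_pattern.search(sample) is not None
cleaned = any_pattern.sub("", sample)

print(dot_found)
print(cleaned)
```

If the engine in use accepts `[\s\S]*?`, the same idea would go in the Find field between the two comments, with the Replace field left blank.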
Thank you, Jon.
Unfortunately, nothing happened. I even tried just using the (.)*? on its own (because I could always not save, if the obliteration was total), but in both cases, DW returned "Done. Not found in the current document."
It's a weird one. But thank you!
Not sure if you can do/find it, but don't forget to use the multi-line flag. That might be affecting the results.
V/r,
^ _ ^
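A note on the flag, with the caveat that flag names vary by engine: in most regex flavors the multi-line flag only changes where `^` and `$` match. What lets `.` cross line breaks is a separate "dot-all" (sometimes called single-line) flag, and Dreamweaver's dialog may not expose either one. A small Python sketch of the difference, on a made-up sample:

```python
import re

# Invented three-line block between two short comments.
text = "<!-- BEGIN -->\nline one\nline two\n<!-- END -->"
pattern = r"<!-- BEGIN -->.*?<!-- END -->"

plain = re.search(pattern, text)                    # '.' stops at '\n'
multiline = re.search(pattern, text, re.MULTILINE)  # changes ^ and $ only
dotall = re.search(pattern, text, re.DOTALL)        # lets '.' match '\n'

print(plain is not None, multiline is not None, dotall is not None)
```

Where no dot-all option is available, writing `[\s\S]` in place of `.` gives the same effect without any flag.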
Thanks for the suggestion, WolfShade, but still no go (although I could totally have the regular expression wrong).
So far, I've tried:
<!-- BEGIN WAYBACK TOOLBAR INSERT -->
(.)*?
<!-- END WAYBACK TOOLBAR INSERT -->
and
<!-- BEGIN WAYBACK TOOLBAR INSERT -->
(.)*? V/r,
<!-- END WAYBACK TOOLBAR INSERT -->
Even
(.)*? and/or (.)*?V/r, (on their own)
as Find & Replace as Regular Expressions and it's not working for me. I feel that I'm doing something wrong (but I did double and triple check I've selected "Use regular expression").
I even tried both/all as text (rather than source code).
Thank you both for your help.
I'm a bit confused. Why are you adding my "V/r" to the mix? That is for "Very respectfully", how I usually end my post, just before my "signature".
V/r,
^ _ ^
Had I not known you use V/r in the signature of all of your posts, I might have assumed it was some kind of multi-line flag that you mentioned in your answer that needed to be tacked onto the RegEx.
How many times does the Wayback comment pair appear in a given page?
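The count matters because a greedy "match anything" pattern runs from the first BEGIN comment to the last END comment, taking any real content in between with it. A lazy quantifier stops at the nearest END comment instead. A short Python sketch with an invented page that holds two blocks:

```python
import re

# Invented page: two toolbar-style blocks with real content between them.
page = (
    "<!-- BEGIN WAYBACK TOOLBAR INSERT -->junk 1<!-- END WAYBACK TOOLBAR INSERT -->"
    "<p>keep me</p>"
    "<!-- BEGIN WAYBACK TOOLBAR INSERT -->junk 2<!-- END WAYBACK TOOLBAR INSERT -->"
)

BEGIN = r"<!-- BEGIN WAYBACK TOOLBAR INSERT -->"
END = r"<!-- END WAYBACK TOOLBAR INSERT -->"

# Greedy: [\s\S]* grabs as much as it can, so it runs to the LAST end marker.
greedy = re.sub(BEGIN + r"[\s\S]*" + END, "", page)

# Lazy: [\s\S]*? grabs as little as it can, so it stops at the FIRST end marker.
lazy = re.sub(BEGIN + r"[\s\S]*?" + END, "", page)

print(repr(greedy))
print(repr(lazy))
```

With one pair per page the two forms remove the same text, but the lazy form is the safer default.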
If there's only one set of those comments per page, and EVERYTHING between them is junk, you should be able to use this...
<!-- BEGIN WAYBACK TOOLBAR INSERT -->
(.+)((\s)+(.+))+
<!-- END WAYBACK TOOLBAR INSERT -->
The space between the opening and closing comments, if any, matters to DW. So if the comments have line breaks between them and the content, it'll need to be there in the Find field too.
<!-- BEGIN WAYBACK TOOLBAR INSERT -->
(.+)((\s)+(.+))+
<!-- END WAYBACK TOOLBAR INSERT -->
...also noteworthy, my version of DW (CC2015) doesn't like working with RegEx for very long and requires a program restart every now and again or it stops finding matches.
I THINK I FOUND IT! (I love a good RegEx challenge.)
FIND: (<!-- BEGIN WAYBACK TOOLBAR INSERT -->(\s)+)([^<])+(<!-- END WAYBACK TOOLBAR INSERT -->)
$1 = (<!-- BEGIN WAYBACK TOOLBAR INSERT -->(\s)+)
$2 = (\s), nested inside $1
$3 = ([^<])
$4 = (<!-- END WAYBACK TOOLBAR INSERT -->)
REPLACE: $1$2$2$4
This will keep the line break/carriage returns between comments. (The $1 to $4 lines just label the capture groups for clarity and should not be used.)
V/r,
^ _ ^
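One thing to watch with `([^<])+`: it matches a run of characters that are not `<`, so it can only span text that has no tags in it. A block containing `<script>` or `<div>` tags would stop it at the first `<`. A quick Python sketch with invented sample strings (Python writes the replacement as `\1\2\2\4` rather than `$1$2$2$4`):

```python
import re

find = re.compile(
    r"(<!-- BEGIN WAYBACK TOOLBAR INSERT -->(\s)+)"
    r"([^<])+"
    r"(<!-- END WAYBACK TOOLBAR INSERT -->)"
)

# Invented samples: plain text between the comments, then a tag between them.
text_only = (
    "<!-- BEGIN WAYBACK TOOLBAR INSERT -->\n"
    "just some text\n"
    "<!-- END WAYBACK TOOLBAR INSERT -->"
)
with_tag = (
    "<!-- BEGIN WAYBACK TOOLBAR INSERT -->\n"
    "<div>toolbar</div>\n"
    "<!-- END WAYBACK TOOLBAR INSERT -->"
)

# The plain-text block matches: the content is replaced by line breaks
# and both comments are kept.
text_only_result = find.sub(r"\1\2\2\4", text_only)

# The tagged block does not match, because [^<] cannot consume '<div>',
# so sub() returns the input unchanged.
with_tag_result = find.sub(r"\1\2\2\4", with_tag)

print(repr(text_only_result))
print(with_tag_result == with_tag)
```

Swapping `([^<])+` for `([\s\S])+?` would let the middle group cross tags and line breaks, assuming the engine supports lazy quantifiers.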
Well, I am embarrassed by adding WolfShade's sig to my efforts*. I may (just a bit) be very literal.
Thank you both for being patient with me (I swear, I can "read" Perl, some ASP, a little Cold Fusion but I just do not understand regular expressions at all. It's like a mental block).
I tried (after restarting DW):
<!-- BEGIN WAYBACK TOOLBAR INSERT -->
(.+)((\s)+(.+))+
<!-- END WAYBACK TOOLBAR INSERT -->
And no joy. I have to be doing something wrong. And believe me, I could be wrong at the Olympics.
Here's an example of what I'm trying to get rid of.
<!-- BEGIN WAYBACK TOOLBAR INSERT -->
<script type="text/javascript" src="/_static/js/timestamp.js" charset="utf-8"></script>
<script type="text/javascript" src="/_static/js/graph-calc.js" charset="utf-8"></script>
<script type="text/javascript" src="/_static/js/auto-complete.js" charset="utf-8"></script>
<script type="text/javascript" src="/_static/js/toolbar.js" charset="utf-8"></script>
<style type="text/css">
body {
margin-top:0 !important;
padding-top:0 !important;
/*min-width:800px !important;*/
}
.wb-autocomplete-suggestions {
text-align: left; cursor: default; border: 1px solid #ccc; border-top: 0; background: #fff; box-shadow: -1px 1px 3px rgba(0,0,0,.1);
position: absolute; display: none; z-index: 2147483647; max-height: 254px; overflow: hidden; overflow-y: auto; box-sizing: border-box;
}
.wb-autocomplete-suggestion { position: relative; padding: 0 .6em; line-height: 23px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; font-size: 1.02em; color: #333; }
.wb-autocomplete-suggestion b { font-weight: bold; }
.wb-autocomplete-suggestion.selected { background: #f0f0f0; }
</style>
<div id="wm-ipp-base" lang="en" style="display:none;direction:ltr;">
<div id="wm-ipp" style="position:fixed;left:0;top:0;right:0;">
<div id="wm-ipp-inside">
<div style="position:relative;">
<div id="wm-logo" style="float:left;width:130px;padding-top:10px;">
<a href="/web/" title="Wayback Machine home page"><img src="/_static/images/toolbar/wayback-toolbar-logo.png" alt="Wayback Machine" width="110" height="39" border="0" /></a>
</div>
<div class="r" style="float:right;">
<div id="wm-btns" style="text-align:right;height:25px;">
<div id="wm-save-snapshot-success">success</div>
<div id="wm-save-snapshot-fail">fail</div>
<a href="#"
onclick="__wm.saveSnapshot('http://therecreationclub.com/', '20120824012829')"
title="Share via My Web Archive"
id="wm-save-snapshot-open"
>
<span class="iconochive-web"></span>
</a>
<a href="https://archive.org/account/login.php"
title="Sign In"
id="wm-sign-in"
>
<span class="iconochive-person"></span>
</a>
<span id="wm-save-snapshot-in-progress" class="iconochive-web"></span>
<a href="http://faq.web.archive.org/" title="Get some help using the Wayback Machine" style="top:-6px;"><span class="iconochive-question" style="color:rgb(87,186,244);font-size:160%;"></span></a>
<a id="wm-tb-close" href="#close" onclick="__wm.h(event);return false;" style="top:-2px;" title="Close the toolbar"><span class="iconochive-remove-circle" style="color:#888888;font-size:240%;"></span></a>
</div>
<div id="wm-share" style="text-align:right;">
<a href="#" onclick="window.open('https://www.facebook.com/sharer/sharer.php?u=http://web.archive.org/web/20120824012829/http://therec...', '', 'height=400,width=600'); return false;" title="Share on Facebook" style="margin-right:5px;" target="_blank"><span class="iconochive-facebook" style="color:#3b5998;font-size:160%;"></span></a>
<a href="#" onclick="window.open('https://twitter.com/intent/tweet?text=http://web.archive.org/web/20120824012829/http://therecreation...', '', 'height=400,width=600'); return false;" title="Share on Twitter" style="margin-right:5px;" target="_blank"><span class="iconochive-twitter" style="color:#1dcaff;font-size:160%;"></span></a>
</div>
</div>
<table class="c" style="">
<tbody>
<tr>
<td class="u" colspan="2">
<form target="_top" method="get" action="/web/submit" name="wmtb" id="wmtb"><input type="text" name="url" id="wmtbURL" value="http://therecreationclub.com/" onfocus="this.focus();this.select();" /><input type="hidden" name="type" value="replay" /><input type="hidden" name="date" value="20120824012829" /><input type="submit" value="Go" /></form>
</td>
<td class="n" rowspan="2" style="width:110px;">
<table>
<tbody>
<!-- NEXT/PREV MONTH NAV AND MONTH INDICATOR -->
<tr class="m">
<td class="b" nowrap="nowrap">Jul</td>
<td class="c" id="displayMonthEl" title="You are here: 01:28:29 Aug 24, 2012">AUG</td>
<td class="f" nowrap="nowrap"><a href="http://web.archive.org/web/20140105113758/http://therecreationclub.com/" title="05 Jan 2014"><strong>Jan</strong></a></td>
</tr>
<!-- NEXT/PREV CAPTURE NAV AND DAY OF MONTH INDICATOR -->
<tr class="d">
<td class="b" nowrap="nowrap"><img src="/_static/images/toolbar/wm_tb_prv_off.png" alt="Previous capture" width="14" height="16" border="0" /></td>
<td class="c" id="displayDayEl" style="width:34px;font-size:24px;white-space:nowrap;" title="You are here: 01:28:29 Aug 24, 2012">24</td>
<td class="f" nowrap="nowrap"><a href="http://web.archive.org/web/20140105113758/http://therecreationclub.com/" title="11:37:58 Jan 05, 2014"><img src="/_static/images/toolbar/wm_tb_nxt_on.png" alt="Next capture" width="14" height="16" border="0" /></a></td>
</tr>
<!-- NEXT/PREV YEAR NAV AND YEAR INDICATOR -->
<tr class="y">
<td class="b" nowrap="nowrap">2011</td>
<td class="c" id="displayYearEl" title="You are here: 01:28:29 Aug 24, 2012">2012</td>
<td class="f" nowrap="nowrap"><a href="http://web.archive.org/web/20140105113758/http://therecreationclub.com/" title="05 Jan 2014"><strong>2014</strong></a></td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td class="s">
<div id="wm-nav-captures">
<a class="t" href="/web/20120824012829*/http://therecreationclub.com/" title="See a list of every capture for this URL">25 captures</a>
<div class="r" title="Timespan for captures of this URL">24 Aug 2012 - 18 Jul 2019</div>
</div>
</td>
<td class="k">
<a href="" id="wm-graph-anchor">
<div id="wm-ipp-sparkline" title="Explore captures for this URL" style="position: relative">
<canvas id="wm-sparkline-canvas" width="600" height="27" border="0"></canvas>
</div>
</a>
</td>
</tr>
</tbody>
</table>
<div style="position:absolute;bottom:0;right:2px;text-align:right;">
<a id="wm-expand" class="wm-btn wm-closed" href="#expand" onclick="__wm.ex(event);return false;"><span id="wm-expand-icon" class="iconochive-down-solid"></span> <span style="font-size:80%">About this capture</span></a>
</div>
</div>
<div id="wm-capinfo" style="border-top:1px solid #777;display:none; overflow: hidden">
<div style="background-color:#666;color:#fff;font-weight:bold;text-align:center">COLLECTED BY</div>
<div style="padding:3px;position:relative" id="wm-collected-by-content">
<div style="display:inline-block;vertical-align:top;width:50%;">
<span class="c-logo" style="background-image:url(https://archive.org/services/img/alexacrawls);"></span>
Organization: <a style="color:#33f;" href="https://archive.org/details/alexacrawls" target="_new"><span class="wm-title">Alexa Crawls</span></a>
<div style="max-height:75px;overflow:hidden;position:relative;">
<div style="position:absolute;top:0;left:0;width:100%;height:75px;background:linear-gradient(to bottom,rgba(255,255,255,0) 0%,rgba(255,255,255,0) 90%,rgba(255,255,255,255) 100%);"></div>
Starting in 1996, <a href="http://www.alexa.com/">Alexa Internet</a> has been donating their crawl data to the Internet Archive. Flowing in every day, these data are added to the <a href="http://web.archive.org/">Wayback Machine</a> after an embargo period.
</div>
</div>
<div style="display:inline-block;vertical-align:top;width:49%;">
<span class="c-logo" style="background-image:url(https://archive.org/services/img/alexacrawls)"></span>
<div>Collection: <a style="color:#33f;" href="https://archive.org/details/alexacrawls" target="_new"><span class="wm-title">Alexa Crawls</span></a></div>
<div style="max-height:75px;overflow:hidden;position:relative;">
<div style="position:absolute;top:0;left:0;width:100%;height:75px;background:linear-gradient(to bottom,rgba(255,255,255,0) 0%,rgba(255,255,255,0) 90%,rgba(255,255,255,255) 100%);"></div>
Starting in 1996, <a href="http://www.alexa.com/">Alexa Internet</a> has been donating their crawl data to the Internet Archive. Flowing in every day, these data are added to the <a href="http://web.archive.org/">Wayback Machine</a> after an embargo period.
</div>
</div>
</div>
<div style="background-color:#666;color:#fff;font-weight:bold;text-align:center" title="Timestamps for the elements of this page">TIMESTAMPS</div>
<div>
<div id="wm-capresources" style="margin:0 5px 5px 5px;max-height:250px;overflow-y:scroll !important"></div>
<div id="wm-capresources-loading" style="text-align:left;margin:0 20px 5px 5px;display:none"><img src="/_static/images/loading.gif" alt="loading" /></div>
</div>
</div></div></div></div><script type="text/javascript">
__wm.bt(600,27,25,2,"web","http://therecreationclub.com/","2012-08-24",1996,"/_static/",['css/banner-styles.css','css/iconochive.css']);
</script>
<!-- END WAYBACK TOOLBAR INSERT -->
Oh snap. I thought "insert HTML" would be like "insert code". Seriously, I find new ways to fail every day.
Apologies for the hideous mess I've just added.
Here's a link to a zip file of that mess.
sueb, read my reply #9 (starts with I THINK I FOUND IT!). It has exactly what you are looking for. If you need an explanation, just ask.
V/r,
^ _ ^
Aha! Now we're getting somewhere. Thank you.
Okay, I tried the suggested edit:
(<!-- BEGIN WAYBACK TOOLBAR INSERT -->(\s)+)([^<])+(<!-- END WAYBACK TOOLBAR INSERT -->)
I tried it on the above (hideous) HTML. Nothing replaced.
So, I slowly removed a bit of the HTML, a chunk at a time.
I got down to just a few javascript sources and...it worked. It seems that it struggles (as do I) with the complexity of the page.
In other words, I had to remove most of the page to get the RegEx to work.
Having said that, I couldn't duplicate it - that is, I couldn't make it work twice in a row.
Do you think maybe my old DW can't do regular expressions?
Thank you again for all your help.
Are you sure you've got 'Use Regular Expressions' checked? And 'Ignore White Spaces' and 'Match Case' unchecked?
Honestly, if the example you provided in your original post was copied/pasted from your code, then it should work. I guess I could have taken into consideration the possibility that not all instances are typed exactly the same (additional/spurious spaces, etc.) But if all instances are the same, my suggestion should have worked on all of them.
I'll go over it, again, and see if I could improve it a bit, and post that.
V/r,
^ _ ^
Okay, this isn't too different from my earlier suggestion. I've added 'zero or more' whitespaces inside the comment tags themselves, in case there is more than one space between <!-- and the text and then -->.
FIND: (<!--\s*BEGIN WAYBACK TOOLBAR INSERT\s*-->(\s)+)([^<])+(<!--\s*END WAYBACK TOOLBAR INSERT\s*-->)
REPLACE: $1$2$2$4
HTH,
^ _ ^
I've upped the game, a bit. I'm including a screenshot.
FIND: (<!--\s*[\w\s\t]+\s*-->(\s)+)([^<])+(<!--\s*[\w\s\t]+\s*-->)
REPLACE: $1$2$2$4
Screencap:
[\w\s\t]+ means 'one or more word characters (letters, digits, or underscores) or whitespace characters (spaces, tabs, or line breaks)'; the \t is redundant because \s already covers tabs. So the text between <!-- and --> can be anything made of those characters, not just specific words. The screencap shows this in action. I've hit the 'Replace' button just once, and the top lines were altered. It is now highlighting the next to be changed if I hit the button, again.
HTH,
^ _ ^
UPDATE: I just realized that it won't consider punctuation. But you can modify that, if needed.
Wow. I really appreciate your efforts. Thank you.
How about a couple of videos (of about two seconds each)?
This one is your screenshot suggestion.
This one is your penultimate suggestion.
I'm beginning to think that it's either DW6 or perhaps DW6 Mac. This is the first time I've ever tried Regular Expressions on DW6. I even tried on Brackets (by Adobe) but it showed "no results" as soon as I put in the expression - having said that...I only assume I was in Reg Exp mode (horribly geeky interface).
This seemed like a good idea when I started out but now it's making my brain hurt.
I can't thank you enough for your efforts. Thank you.
Always glad to help, when I can. Unfortunately, where I work the links you provided are being blocked (they are very network security conscious - aka paranoid - around here.)
It _could_ be DW6 on a Mac, possibly. I've got DW5.5 on a Win10 system in our development environment and have no problems.
I hope you can find a solution before going insane.
V/r,
^ _ ^
Thank you again.
The videos were just confirmation that I had checked "Regular expression". And that I'd pasted in the correct things.
I'll just go back to doing it manually. I can churn through them if I get "in the zone".
Thanks again for your efforts and patience.
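For anyone with the same hundreds of files who would rather not do it by hand, the job can also be done outside Dreamweaver. The following is only a sketch under stated assumptions: the function names `strip_toolbar` and `clean_tree` are made up for illustration, the files are assumed to be in an ASCII-compatible encoding, and every toolbar block is assumed to use the exact BEGIN/END comment wording shown in this thread. It works on raw bytes so that encodings and line endings are left untouched.

```python
import re
from pathlib import Path

# Matches the whole insert, comments included. [\s\S]*? spans line breaks
# and stops at the first END marker after each BEGIN marker.
TOOLBAR = re.compile(
    rb"<!--\s*BEGIN WAYBACK TOOLBAR INSERT\s*-->"
    rb"[\s\S]*?"
    rb"<!--\s*END WAYBACK TOOLBAR INSERT\s*-->"
)


def strip_toolbar(html: bytes) -> bytes:
    """Return html with every Wayback toolbar insert removed."""
    return TOOLBAR.sub(b"", html)


def clean_tree(root: Path, suffixes=(".html", ".htm")) -> int:
    """Rewrite matching files under root in place; return how many changed."""
    changed = 0
    for path in sorted(root.rglob("*")):
        # Skip directories and anything that is not an HTML file.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        original = path.read_bytes()
        cleaned = strip_toolbar(original)
        # Only touch files that actually contained a toolbar block.
        if cleaned != original:
            path.write_bytes(cleaned)
            changed += 1
    return changed
```

Calling `clean_tree(Path("site-copy"))`, where `site-copy` is a hypothetical folder name, would rewrite the HTML files beneath it. Since the files are overwritten in place, it is worth running this on a copy of the site first.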