My office is undergoing a huge redesign, and all pages are being moved and/or having their extensions changed (from .htm to .cfm). Our webmaster has dictated that we need to do a redirect for every single page so that no one gets a 404 error.
We're talking thousands upon thousands of pages, and after getting some good advice in the Server Management forum here on Sitepoint, I came up with a way to do the redirects sort of automatically.
Since we've been keeping an Excel spreadsheet of the old and new page locations, I decided to import that spreadsheet into our database, and use the database to handle the redirects.
Basically, I built a custom 404 page and had our server guy point the webserver to it. The page takes the old url and queries the database to find it. If it finds a match, it plugs the new url for that page into a meta refresh tag and displays a message that the page has been moved. If no match is found, it displays the standard 404 message.
Since I want search engines to know that the pages have been moved, I decided to stuff a cfheader tag into the mix, but I'm not certain I'm doing it right. This is what I have in the head section of the 404 page:
<cfquery name="redirect" datasource="#ourdsn#">
SELECT new_url
FROM redirects
WHERE old_url = <cfqueryparam value="#requeststring#" cfsqltype="cf_sql_varchar">
</cfquery>
<cfif #redirect.recordcount# neq 0>
<cfheader name="location" value="http://www.ourdomain.com#redirect.new_url#">
<cfheader statuscode="301" statustext="Moved Permanently" />
<meta http-equiv="refresh" content="5;url=<cfoutput>#redirect.new_url#</cfoutput>" />
<cfelse>
<cfheader statuscode="404" statustext="Not Found" />
</cfif>
Is this the proper way to do this sort of thing, or am I totally off-base?
That is a pretty good way to do it if you can't do the 301 permanent redirect in the web server, which sounds like it could be quite difficult in your situation.
Just a couple of points.
1) <cfif #redirect.recordcount# neq 0> does not need pound signs.
<cfheader name="location" value="http://www.ourdomain.com#redirect.new_url#">
Does need pound signs, but it also needs to be inside a <cfoutput...> block to resolve the variable
which you did not show in your code sample. I.E.
2) <cfoutput><cfheader name="location" value="http://www.ourdomain.com#redirect.new_url#"></cfoutput>
3) The <meta....> tag is supposed to be inside of a proper html page. I.E.
<html>
<head>
<meta...>
.
.
.
</html>
But other than that, you seem to be on the right track.
To be honest, I would not involve CF in this process at all: it's the job of the web server to handle which page to serve, not CF's.
I'd use mod_rewrite (either Apache's or IIS's flavour of it, depending on what you're running) and a rewrite map, which could be derived from your Excel file, I should think, to organise the rewrites.
Using your technique, you're doubling the request traffic for the CF server (once for the 404, once for the actual page), and hitting the DB for every single 404 as well. That's a lot of work which you could just get the web server to do (and that's what the web server is designed to do).
--
Adam
Is there any information you can point us to on this rewrite map concept?
I too believe that it should be done in the web server whenever possible. But I didn't know of a good way to manage a large number of redirects that do not follow any good set of patterns.
http://www.iis.net/download/URLRewrite/ => http://learn.iis.net/page.aspx/469/using-rewrite-maps-in-url-rewrite-module/
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html => http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewritemap
--
Adam
There is my "Something NEW" for the day, maybe the week.
Thanks
You can use a DB to handle the volume as long as you have a way to extract to a map file easily (this would be fairly quick to do once the rewrite process is in place). I think Helicon has a way of accessing the DB automatically.
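As a rough sketch of the "extract to a map file" idea, a CFML page could dump the table to a plain text file with one old/new pair per line, which is the layout Apache's text-based rewrite maps use. The `redirects` table, its columns, and `ourdsn` come from the original post; the `redirects.map` file name and the `allRedirects` and `mapLines` variables are hypothetical, and the sketch assumes none of the URLs contain spaces.

```cfm
<!--- Pull every old/new pair from the redirects table used earlier in the thread --->
<cfquery name="allRedirects" datasource="#ourdsn#">
    SELECT old_url, new_url
    FROM redirects
    ORDER BY old_url
</cfquery>

<!--- Build one "old_url new_url" line per record --->
<cfset mapLines = ArrayNew(1)>
<cfloop query="allRedirects">
    <cfset ArrayAppend(mapLines, allRedirects.old_url & " " & allRedirects.new_url)>
</cfloop>

<!--- Write the lines to a hypothetical map file next to this template --->
<cffile action="write"
        file="#ExpandPath('./redirects.map')#"
        output="#ArrayToList(mapLines, Chr(10))#"
        addnewline="yes">
```

The file would need to be regenerated whenever the table changes, and a different map format would need a different output step.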
@Ian: I can never figure when to use the pound signs and when not to, so I always use them since they never seem to break it. As far as the <cfoutput>s, I was thinking they were not needed when inside of a ColdFusion tag. I'll add those. As far as where to put the meta tags, the code I pasted actually sits in the <head> tag of the 404 page, so everything is where it's supposed to be.
@Adam: Yep, I was trying to get our webmaster to do the redirects, but for some reason he thinks that he'd have to do them one-by-one. I have no access to the webserver, so I can't do them myself. As far as hitting the db for every redirect, I was going to cache the query using cachedwithin, and put it on something like a 7-day timespan to limit hits on the db. But I'll definitely forward the links you supplied.
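The caching idea mentioned above might look roughly like the following. This is only a sketch: it reuses the query from the first post, and it assumes a ColdFusion version that allows `<cfqueryparam>` inside a cached query (older releases did not).

```cfm
<!--- Cache each distinct lookup for 7 days: CreateTimeSpan(days, hours, minutes, seconds) --->
<cfquery name="redirect" datasource="#ourdsn#"
         cachedwithin="#CreateTimeSpan(7, 0, 0, 0)#">
    SELECT new_url
    FROM redirects
    WHERE old_url = <cfqueryparam value="#requeststring#" cfsqltype="cf_sql_varchar">
</cfquery>
```

Each distinct old URL is cached as its own result, so changes to the table may not show up for a given URL until its cached entry expires.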
Thanks everyone!
@Ian: I can never figure when to use the pound signs and when not to
Fortunately these things are well documented, so there's no need to figure anything out; one just needs to read the docs:
http://help.adobe.com/en_US/ColdFusion/9.0/Developing/WSc3ff6d0ea77859461172e0811cbec22c24-7fc3.html
@Adam: Yep, I was trying to get our webmaster to do the redirects, but for some reason he thinks that he'd have to do them one-by-one.
Hmmm. Well, saving someone some effort should seldom be grounds for doing something a suboptimal way. But still, I can see where they're coming from. What web server are you using? Maybe it's IIS and they're unaware IIS now has mod_rewrite (it's new to v7)? Even still, Helicon do a third-party mod_rewrite for IIS, which works very well.
--
Adam
I'm pretty sure we use IIS6, and for whatever reason, they flat out refuse to install third party software. If MS or Adobe don't make it, they won't use it.
To preview the documentation Adam pointed you to:
You need pound signs when you are rendering a variable for output OR inside a string.
Thus in this line:
<cfoutput><cfheader name="location" value="http://www.ourdomain.com#redirect.new_url#"></cfoutput>
We are using the <cfoutput> tags because you are creating a dynamic string for the value property
of the <cfheader...> tag. It could be rewritten to eliminate the pound signs:
<cfoutput><cfheader name="location" value="http://www.ourdomain.com" & redirect.new_url></cfoutput>
But I think most ColdFusion developers would find that a less clear syntax.
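As a small illustration of the pound-sign rule described above, the sketch below shows a variable used in an expression (no pound signs) next to a variable evaluated inside a string (pound signs). It reuses the `redirect` query from the original post; `targetUrl` is a hypothetical variable name introduced only for this example.

```cfm
<!--- No pound signs: the variable is part of an expression --->
<cfif redirect.recordcount neq 0>
    <!--- Concatenation with &: still an expression, so no pound signs --->
    <cfset targetUrl = "http://www.ourdomain.com" & redirect.new_url>
    <!--- Pound signs: the variable is evaluated inside a quoted string --->
    <cfheader name="Location" value="#targetUrl#">
</cfif>
```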
Ian, when I do that, the <cfoutput> tags get piped to the URL string. It works fine without them though. Too fine, actually, as it behaves like a <cflocation> and performs an instantaneous redirect and bypasses the meta refresh tag altogether. As such, I've removed it and just left the statuscode one in.
Well, most people want it to work immediately. That is the way it is supposed to work. The browser is told, "This is not here anymore; it is over there," and the browser goes, "OK, I'll get it from over there right now!"
The meta refresh is usually considered a poor man's option of last resort, for when one just cannot use the more efficient and correct HTTP status response.
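For reference, a minimal sketch of a 404 handler that relies only on the HTTP status response might look like this. It reuses the names from the original post (`ourdsn`, `requeststring`, the `redirects` table, and the `www.ourdomain.com` host) and is one possible arrangement, not a definitive implementation.

```cfm
<!--- Look up the requested URL in the redirects table --->
<cfquery name="redirect" datasource="#ourdsn#">
    SELECT new_url
    FROM redirects
    WHERE old_url = <cfqueryparam value="#requeststring#" cfsqltype="cf_sql_varchar">
</cfquery>

<cfif redirect.recordcount neq 0>
    <!--- Match found: answer with a permanent redirect and stop processing --->
    <cfheader statuscode="301" statustext="Moved Permanently">
    <cfheader name="Location" value="http://www.ourdomain.com#redirect.new_url#">
    <cfabort>
<cfelse>
    <!--- No match: send a real 404 and let the page body below render --->
    <cfheader statuscode="404" statustext="Not Found">
</cfif>
```

With this arrangement the meta refresh tag is unnecessary, because the browser follows the `Location` header as soon as it receives the 301.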
Then, how is the user supposed to know that the page has moved? The reason I've always used the refresh is to give the user some idea that he/she needs to update their bookmarks. If you just serve up the new page, they may notice it looks different, but they might not notice that it's been moved.
If sometime down the road, the redirect gets nuked, the user's old bookmark will break, and you're back to a 404, which is what the redirect was supposed to avoid in the first place.
I would suggest you search for some of the good discussions about the hows and whys of redirecting content. There is a lot of information out there that I do not have the time at the moment to summarize here. But to make one brief point:
You can never 'nuke' a redirect once you put one in place and NOT risk getting 404 errors. Even working with the assumption that your users have been notified that the page has moved AND that all users so notified bothered to update their bookmarks, that still leaves out the set of users who did not visit your page during the time frame that the redirect page existed. They will then get the dreaded 404 error because they were never told to update their bookmarks. This also does not account for users who are coming from old links on other web sites, printed material, or anywhere else the URL may have been published.
It is also possible (but I'm not sure any current browser does this) that the client tool can automatically update a saved URI with the new URI when it receives a proper permanent redirect. That is the way the standard reads, at least.