Rewrite for canonicalization - differences

This is the code I currently have in my .htaccess for canonicalization issues:

RewriteCond %{HTTP_HOST} !^www\.callput\.com\.au$ [NC]
RewriteRule .? http://www.callput.com.au%{REQUEST_URI} [L,R=301]

I’ve stumbled upon another set of rules:

RewriteCond %{HTTP_HOST} ^callput.com.au [NC]
RewriteRule ^(.*)$ http://www.callput.com.au/$1 [L,R=301]

I am trying to work out the differences (or advantages) of either set. This tutorial has helped a little (http://www.widexl.com/tutorials/mod_rewrite.html) but I don’t understand the “?” quantifier in the first set or the escapes “\”.

Also, I have a number of RewriteRule directives below the ones above. Do I leave out the [L] until the last one, or is this unnecessary?

In regular expressions, periods have special meaning. A period means any character may be matched in that spot. If, on the other hand, you wanted to match a literal period, then you would escape it with a backslash.

So an unescaped period matches any character. The question mark also has special meaning: it matches the thing before it 0 or 1 times, which makes the thing before it optional. The pattern .? therefore means “optionally match any character.” This is a trick to ensure that the pattern always matches. It is used here because the real test is performed by the preceding RewriteCond line, so the RewriteRule pattern itself has nothing left to check.
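As a rough illustration, here is the first rule set again with each piece annotated. The comments are explanatory additions, and the RewriteEngine line is assumed to be needed only if it is not already present elsewhere in the file.

```apache
RewriteEngine On

# \. matches a literal period; an unescaped . would match any single character.
# The leading ! negates the test: the condition holds when the host
# is NOT www.callput.com.au. [NC] makes the comparison case-insensitive.
RewriteCond %{HTTP_HOST} !^www\.callput\.com\.au$ [NC]

# .? matches zero or one character, so it succeeds for every request path.
RewriteRule .? http://www.callput.com.au%{REQUEST_URI} [L,R=301]
```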

In the first set of rules, the condition checks whether the host is anything other than www.callput.com.au. In the second set, the condition is meant to check whether the host is exactly callput.com.au, although as written it needs an end-of-string anchor and escaped periods to do that reliably. This difference means that the first condition will match any subdomain (abc.callput.com.au, yadayada.callput.com.au, etc.), the bare callput.com.au, or any other hostname that reaches the site, and redirect it to www.callput.com.au, whereas the second condition will match only callput.com.au and redirect it to www.callput.com.au. In this case, neither is inherently better. It depends on which behavior you want.
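If you prefer the second behavior, a tightened version of that rule set might look like the sketch below, with the periods escaped and the end-of-string anchor added. Treat it as a starting point rather than a drop-in; it assumes the Host header arrives without a port number.

```apache
RewriteEngine On

# Match only the bare domain: periods escaped, $ anchors the end of the host.
RewriteCond %{HTTP_HOST} ^callput\.com\.au$ [NC]
RewriteRule ^(.*)$ http://www.callput.com.au/$1 [L,R=301]
```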

The next difference, which is less important, is how they obtain the URL path used to build the redirect URL. The first set of rules uses an existing server variable, %{REQUEST_URI}. The second set captures (that’s what the parentheses do) the URL path that the rewrite rule is operating on and uses the captured value in the replacement URL (that’s the $1 variable). The two approaches rarely produce different results. They can differ if any preceding rewrite rules have already altered the URL, since the capture sees the altered path, and it may or may not be desirable to retain those alterations. The capturing used in the second set seems to be the prevailing choice among most developers, and it’s also how the Apache documentation shows it being done.
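One further case where the two forms can diverge, offered as an illustration rather than something taken from your setup: when the rules sit in a subdirectory's .htaccess, the rule pattern is matched against the path with that directory prefix stripped, while %{REQUEST_URI} still holds the full path. The /blog/ directory and post.html below are hypothetical.

```apache
# Assumption: these rules live in a hypothetical /blog/.htaccess and the
# request is http://callput.com.au/blog/post.html. Use one variant, not both.

RewriteEngine On

# Variant 1: %{REQUEST_URI} is the full path, /blog/post.html.
RewriteCond %{HTTP_HOST} !^www\.callput\.com\.au$ [NC]
RewriteRule .? http://www.callput.com.au%{REQUEST_URI} [L,R=301]
# Redirects to http://www.callput.com.au/blog/post.html

# Variant 2: in per-directory context the pattern sees only post.html,
# so $1 is post.html and the /blog/ prefix is lost.
RewriteCond %{HTTP_HOST} ^callput\.com\.au$ [NC]
RewriteRule ^(.*)$ http://www.callput.com.au/$1 [L,R=301]
# Redirects to http://www.callput.com.au/post.html
```

In the site's top-level .htaccess the two variants send a given request to the same place.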

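You will almost always want to use [L] with any rewrite rule that performs a redirection, so keep it on each redirecting rule rather than only on the last one. [L] stops mod_rewrite from applying later rules to the request in that pass; without it, the redirect target is handed on to the rules that follow.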
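A minimal sketch of how that might look with more than one rule in the file. The second rule and its file names are hypothetical stand-ins for whatever other rules you have.

```apache
RewriteEngine On

# Canonical-host redirect. [L] ends rule processing for this pass,
# so the rules below never see the redirect target.
RewriteCond %{HTTP_HOST} !^www\.callput\.com\.au$ [NC]
RewriteRule .? http://www.callput.com.au%{REQUEST_URI} [L,R=301]

# Hypothetical follow-on redirect, given its own [L] as well.
RewriteRule ^old-page\.html$ /new-page.html [L,R=301]
```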

Eventually, consider reading through the Apache documentation. It’s the most definitive and comprehensive source of information you’ll ever find on this subject.