Please improve my .htaccess / mod rewrite code

So far I have this:

It redirects all requests (if needed) to https://www.domain.com/page/var1/var2/ etc. and rewrites so index.php receives all requests.


RewriteEngine On

# required on my server
RewriteBase /

# redirect to full correct url if missing trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ https://www.domain.co.uk/$1/ [R=301,L]

# redirect to full correct url if not complete
rewritecond %{http_host} ^domain.co.uk [nc,OR]
# the next line may be specific to my server
RewriteCond %{ENV:HTTPS} !on [NC]
RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R=301,L]

# use index.php for all requests
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L,NS]

…but I could do with help on two things:

(1) I’m guessing the code could be written better. It works perfectly, but I’m sure it could be condensed.

(2) I also need to add a rule so that a request for just domain.com (with no page / url vars etc.) redirects to domain.com/home/. How would I do this without just duplicating another chunk of code again?

JS153,

Comments embedded in your code:


RewriteEngine On

# required on my server
RewriteBase /
# BS - that's designed to UNDO mod_alias redirections so mod_rewrite can work on a request.
# While it shouldn't hurt, it's incorrect

# redirect to full correct url if missing trailing slash
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ https://www.domain.co.uk/$1/ [R=301,L]
# IMHO, it's EXTREMELY bad technique to add a trailing / to anything other than a directory request.
# Why in the world would you go out of your way to do this?

# redirect to full correct url if not complete
rewritecond %{http_host} ^domain.co.uk [nc,OR]
# the next line may be specific to my server
RewriteCond %{ENV:HTTPS} !on [NC]
# (ENV: and [NC] highlighted in the line above)
RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R=301,L]
# Is there a requirement to use https? This seems to be another very silly thing to do

# use index.php for all requests
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .? index.php [L,NS]
# First, . requires a character while .? allows just the domain request
# NS (not for internal sub-requests).

Quote from Apache.org (emphasis on "can even cause errors" added):
"This flag forces the rewrite engine to skip a rewrite rule if the current request is an internal sub-request. For instance, sub-requests occur internally in Apache when mod_include tries to find out information about possible directory default files (index.xxx). On sub-requests it is not always useful, and can even cause errors, if the complete set of rules are applied. Use this flag to exclude some rules."

# Is this really appropriate?

Regards,

DK

Thanks for posting, but your comments are a little odd.

(1) As for using “RewriteBase /” then all I’ve done is follow this: http://www.rackspace.com/knowledge_center/index.php/Why_is_mod_rewrite_not_working_on_my_site. If they’re wrong, then surely it’s not a big deal anyway is it? Are they wrong about their own server?

(2) The trailing slash helps prevent duplicate content for SEO (http://www.seroundtable.com/archives/022083.html). As you control all links on your site, then you just ensure all links include the trailing slash, which means htaccess rarely needs to redirect it. If any come externally then .htaccess does the redirect, which helps prevent duplicate content. Try going to apple.com/ipad and you’ll see they do exactly this.

If you’re saying that my code will add a trailing slash to every url, and it’s only needed for a directory, then that is exactly the kind of thing I need help on. That said, all links will be in the format domain.com/page/var/var2/var3/ etc. anyway.

(3) HTTPS, yes there is a need to use all-https on the site. The site involves content where its security is paramount. Many sites are going https-only these days, such as odesk.com. By going to odesk.com you will see the https redirect.

(4) I ask for help improving the code, and you seem to spend ages commenting on it, but haven’t offered a better way to write the code. I understand nobody has to help on a forum, and I respect that, but you did take the time to comment.

(5) You say, “I’ll let you place that in the mess above”. Well, obviously I can’t, that’s why I was asking. Thanks for providing the line of code though, I just need to know if it needs to be merged in with other parts of the code. Again, you say it’s a mess, so that is why I’m asking for help re-writing it.

(6) Also, you’ve highlighted ENV and [NC] in this:


# the next line may be specific to my server
RewriteCond %{ENV:HTTPS} !on [NC]

…is there a problem, as again that’s come from my server owners specifically (http://www.rackspace.com/knowledge_center/index.php/How_do_I_force_SSL_on_my_PHP_site)?
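For comparison, here is a minimal sketch of the two ways this test is usually written. The first uses mod_rewrite's built-in HTTPS variable, which is the common form when Apache itself terminates the SSL connection. The second reads an environment variable named HTTPS, which only works if the host sets one; that this is what Rackspace does is an assumption based on their article, not something verified here. The [NC] flag is harmless but redundant if the value is always exactly "on".

```apache
# Form 1: built-in variable, exact string comparison against "on"
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R=301,L]

# Form 2: environment variable set by the host (assumed to be Rackspace-specific).
# Use one form or the other, not both.
# RewriteCond %{ENV:HTTPS} !=on
# RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R=301,L]
```

If the first form causes a redirect loop on a given host, that usually means SSL is terminated somewhere in front of Apache, which is the situation the second form is meant for.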

I would still appreciate help from anyone who could spare the time. I can’t / won’t change the things dklynn has commented on though as they are important, but I do need help improving and condensing the code.

Thanks again for taking the time though dklynn. It has at least made me keen to get additional help and make additional checks before I sign this off.

js,

Rackspace controls their own servers and I have no idea what they’re doing with them, however, I stand by my comments.

Regards,

DK

Thanks again for the response. I do appreciate the time you’ve taken.

For some reason you’ve mistaken my post for someone who thinks they’re an expert at .htaccess / mod-rewrite or whatever the best name for this process is. I am simply someone trying to get a website to do what I want it to. If I had posted “I’m brilliant, look at my great .htaccess code”, then your responses would make 100% sense. As for Rackspace vs you then I really couldn’t care less. At this stage I’d say you probably know more than anyone over there. I don’t need to start questioning their advice when it does nothing wrong.

You twice mention for me to ask Rackspace why they do that and this, but again I really couldn’t care. I’m no expert, I just needed help. If Rackspace are wrong then great, but I may as well include the “RewriteBase /” considering they do recommend it and I can’t see it harming anything. Rackspace told me to do it, it seems to work, so what’s the problem?

About the apple.com/ipad thing, my aim is that someone goes to domain.com/page and it redirects to domain.com/page/, which is what apple.com/ipad does. If my .htaccess code doesn’t do this then that is where I am mistaken.

Redirecting all “domain.com/page” requests to “domain.com/page/” with a 301 redirect ensures that Google indexes both as the same page, preventing duplicate content issues. You say “where’s the trailing slash on apple.com/ipad”; well, go to the link and you’ll see Apple redirect (301) and append the trailing slash. Understanding your knowledge of SEO would help me. Mine’s probably 2/10. Do you have any SEO skills (if you do then I need to look at this, but if you know nothing about SEO then this would probably explain it)?

Thanks for the advice regarding “domain.com/page/var/var2/var3” and the “series of ‘pseudo directories’” thing - I actually ensure all links are absolute anyway which I think resolves this. I’ve never had a problem actually with this, but I’m guessing a lot of people do.

Also, does my code add a trailing slash to every url? If it adds the trailing slash like apple.com does then that is my aim, but if it’s adding it to requests it shouldn’t then I need to change that.
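As written, the trailing-slash rule fires for any request that does not end in a slash and does not map to an existing file, so a mistyped asset URL such as /css/missing.css would be redirected to /css/missing.css/. A hedged sketch of one way to narrow it follows, assuming page URLs never have a dot-extension in their last segment (an assumption for illustration, not something stated in this thread):

```apache
# only for requests that are not existing files
RewriteCond %{REQUEST_FILENAME} !-f
# only when the URL does not already end in a slash
RewriteCond %{REQUEST_URI} !/$
# skip anything whose last segment looks like name.ext (css, js, images, ...)
RewriteCond %{REQUEST_URI} !\.[A-Za-z0-9]+$
RewriteRule ^(.*)$ https://www.domain.co.uk/$1/ [R=301,L]
```

With the extra condition, /page and /page/var still get the slash appended, while a missing file.ext falls through to whatever handles 404s.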

Also, I posted 10 lines of code in what I would call three chunks. First it does the trailing slash, then it does the https://www.domain.com/page/var/ redirect and then it ensures index.php is used on all requests. Can these 10 lines be condensed by keeping all the features? I was expecting someone to say, “that can be done in 4 lines” or that kind of thing. If you think that the fact I have gone with those decisions is stupid, then that’s one thing, but it’s whether my stupid decisions are coded correctly that I wanted help on.

I’m thinking now that you think the code is reasonable (not perfect though I’m sure), but it’s the decision to listen to Rackspace and go with the SEO stuff you are questioning. I first thought you suggested the whole thing should be re-done.

In fact, the bit about “I’ll let you place that in the mess above” - I expected it to be merged in with the rest, but it looks like that’s not possible. So if I just need to add that as another line of code then I can easily do that. I just thought there’d be a better way to compress it. As an example, the way I’ve separately entered the trailing slash redirect and the redirect all to https://www.domain.com/page/var etc. - I expected those two chunks to easily be condensable into one (say 2/3 lines for the lot).

You also say, “That was a challenge for you … to see whether you understand mod_rewrite or just want someone to script for you.” - no, neither actually and I think this post shows that I don’t understand mod_rewrite and I think it shows I haven’t just expected someone to script for me.

If the only problem with my code is that I listened to Rackspace advice, followed SEO advice regarding trailing slash and am using https across the whole site then I actually think the code is ok. Unless there’s something else I’ve messed up?

I really think you have misunderstood the idea for my post. My post was aimed to say “this is my code, please help me improve” and nothing else.

I know nothing, you know everything. What more can I say!

Alright, short and sweet: apart from maybe removing the RewriteBase your code can’t be condensed any further than it is now.

And I agree with everything David has said so far :slight_smile:
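Pulling the thread together, a rough sketch of how the whole file might look, including the domain.com to /home/ redirect asked about in (2). It assumes the .htaccess sits in the document root, leaves out RewriteBase as suggested above, and keeps the host-specific ENV:HTTPS test from the original. The `^$` rule for /home/ is an illustration, not a line quoted from this thread.

```apache
RewriteEngine On

# (2) bare domain request: send it to /home/
# In a document-root .htaccess the path is matched without its leading slash,
# so an empty match means "just the domain".
RewriteRule ^$ https://www.domain.co.uk/home/ [R=301,L]

# add the trailing slash (non-file requests only), straight to the full URL
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !/$
RewriteRule ^(.*)$ https://www.domain.co.uk/$1/ [R=301,L]

# force www and https; dots escaped so they match literally
RewriteCond %{HTTP_HOST} ^domain\.co\.uk [NC,OR]
# host-specific test copied from the original; may differ on other servers
RewriteCond %{ENV:HTTPS} !on [NC]
RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R=301,L]

# front controller: everything that is not a real file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L,NS]
```

Because the bare domain is redirected by the first rule, it never reaches the last one, so the `.` versus `.?` question does not arise in this arrangement. Each chunk still does one job, which is consistent with the point that the original cannot really be condensed.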