301 redirect adding querystring, weird

I recently updated a site to make it dynamic and gave it a nice new design. The old site was just plain CSS and HTML and has about 11 listings on Google that I want to redirect to the new site. The new design is on the same domain. I’m just using a basic 301 redirect to do the task but am getting some weird results.

My redirect rule is:

Redirect 301 /old_page.htm http://www.mydomain.co.uk/new-page.htm

This is what I’m getting:

Why is it adding a querystring at the end of the URL? Is there a way to remove the querystring and leave just the redirected URL?

Many thanks

I suspect your REDIRECT is not how you are arriving at the new page. You may be fooling yourself.
Verify the REDIRECT by temporarily removing it - attempt to access the page (with the browser cache disabled) - then reinstate it.

Not quite sure what you’re saying here. Do you suspect it to be a browser cache issue? I’ve tried on both Chrome and Firefox and both are the same. I will try clearing my cache and see what it does. Thanks

My guess is that there is more code at play than just that one redirect; either in the .htaccess or in the server config.

Here’s my .htaccess file


Options +FollowSymLinks
RewriteEngine on

RewriteRule ^index\.htm index\.php
RewriteRule ^gallery\.htm gallery\.php
RewriteRule ^jobs\.htm jobs\.php
RewriteRule ^information\.htm information\.php

RewriteRule ^([a-zA-Z0-9_-]+)\.htm$ page.php?&id=$1

redirect 301 /new_page.htm http://www.mydomain.co.uk/new-page.htm

Hope this helps. I’m a bit lost to be honest

The problem is with the line


RewriteRule ^([a-zA-Z0-9_-]+)\.htm$ page.php?&id=$1

  1. Change that to

RewriteCond %{REQUEST_FILENAME}\.htm !-f
RewriteRule ^([a-zA-Z0-9_-]+)\.htm$ page.php?&id=$1 [L]

  2. Add [L] at the end of all rewriterules

  3. Put the Redirect at the top of the .htaccess, above RewriteEngine On

That should work :slight_smile:

Hi,

I have the same problem. I have put the redirects at the top of the .htaccess above everything else, ensured all RewriteRule statements have an [L] after them and I am STILL getting a query string appended… rats! Any ideas?

/Brian

Hi Brian,

Can you please post your complete .htaccess file so we can have a look?

Thanks.

Hi,

Thanks for taking a look. Here you go… it’s quite long:


redirect 301 /?filename=-Web-Design-Northern-Ireland-Index.html http://www.abipo.com
redirect 301 /Abipo-Web-Design-Moira-Northern-Ireland-Index.html http://www.abipo.com
redirect 301 /Abipo-Web-Design-Northern-Ireland-Index.html http://www.abipo.com
redirect 301 /Abip http://www..com
http://www.mydomain.co.uk




redirect 301 /contact.html http://www.mydomain.com/mydomain-Web-Design-Northern-Ireland-Contact.html
redirect 301 /mydomain-Web-Design-Northern-Ireland-Contact.html?filename=Northern-Ireland/County-Antrim/mydomain-Web-Design-Northern-Ireland-Contact.html http://www.mydomain.com/mydomain-Web-Design-Northern-Ireland-Contact.html
redirect 301 /questions.html http://www.mydomain.com/mydomain-Web-Design-Northern-Ireland-Questions.html




redirect gone /images/BarAbout.html
redirect gone /images/BarBlog.html
redirect gone /images/BarContact.html
redirect gone /images/BarHome.html
redirect gone /images/BarQuestions.html
redirect gone /images/BarSearch.html
redirect gone /images/BarServices.html


redirect gone /images/BarAbout.swf.html
redirect gone /images/BarBlog.swf.html
redirect gone /images/BarContact.swf.html
redirect gone /images/BarHome.swf.html
redirect gone /images/BarQuestions.swf.html
redirect gone /images/BarSearch.swf.html
redirect gone /images/BarServices.swf.html


redirect gone /94303Directory.php


RedirectMatch gone /drupal/*




# compress the files:
# AddOutputFilterByType DEFLATE text/html text/plain text/css text/javascript application/x-javascript


#
#
#SetOutputFilter DEFLATE
#
#


#
#Header set Expires "Thu, 15 Apr 2014 20:00:00 GMT"
#



RewriteEngine On
# non-www to www Redirect In Apache - Based on code from http://www.websitetodos.com/
RewriteCond %{HTTP_HOST} ^mydomain\.com$
RewriteRule (.*) http://www.mydomain.com/$1 [R=301,L]


RewriteRule ^$ /vr_display_95bae8.php?filename=index.html [L,NC]
RewriteRule ^(.*)\.html$ /vr_display_95bae8.php?filename=$1.html [L,NC]
RewriteRule ^(.*)\.htm$ /vr_display_95bae8.php?filename=$1.htm [L,NC]

fs,

Eleven redirections? Surely there are more in your .htaccess file.

Your mod_alias Redirect statements are okay - syntax Redirect {code} {URI} {absolute redirections}.

You are getting a query string because you directed it to be added with mod_rewrite (RewriteRule ^([a-zA-Z0-9_-]+)\.htm$ page.php?&id=$1). If you do not want a query string, do not use the QSA flag, do not add a query string and, if you want to remove any existing query string, add a ? at the end of the redirection (mod_rewrite).
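
As a minimal sketch of the "add a ? at the end" point, assuming the redirect is written as a mod_rewrite rule in the site's root .htaccess (the paths are the ones from the question; the QSD line assumes Apache 2.4 or later):

```apache
RewriteEngine On

# A trailing "?" on the substitution gives the redirect an empty query
# string, so whatever the original request carried is dropped.
RewriteRule ^old_page\.htm$ http://www.mydomain.co.uk/new-page.htm? [R=301,L]

# Apache 2.4+ alternative: the QSD (query string discard) flag does the same.
# RewriteRule ^old_page\.htm$ http://www.mydomain.co.uk/new-page.htm [R=301,L,QSD]
```

Without the trailing ? or QSD, mod_rewrite passes any query string on the incoming request through to the redirect target unchanged.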

The RewriteRules you first showed are effectively chained: without the Last flag, the output of each rule is fed to the rules after it, so a request rewritten to .php by one of the early rules is unlikely to match the final RewriteRule (read the sticky posts or the tutorial article linked in my signature).

The reason for ordering Redirects before RewriteRules is clarity rather than execution order. mod_alias and mod_rewrite are separate modules, and each one processes its own directives as a group, so where a Redirect sits relative to the RewriteRules in the file does not decide which runs first; in .htaccess, mod_rewrite’s rules are generally applied before mod_alias’s Redirects. Of course, you can put them just about anywhere, but grouping each module’s directives together helps you avoid confusion.

Rémon, you KNOW that the Redirects changed the extension from .htm to .php so your {REQUEST_FILENAME}\.htm (where was the extension removed so you needed to add .htm?) would not be matched.
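
One way to sidestep the interaction between the two modules is to express the redirect in mod_rewrite as well, so that the order of the rules in the file really is the order in which they are tried. The sketch below reworks the .htaccess posted earlier under that assumption; it has not been tested on the poster's server, and /old_page.htm is taken from the original question.

```apache
Options +FollowSymLinks
RewriteEngine On

# External 301 first. [L] ends this pass, so the catch-all rule below
# never gets the chance to add ?id=... to the redirected request.
RewriteRule ^old_page\.htm$ http://www.mydomain.co.uk/new-page.htm [R=301,L]

# Internal rewrites: the browser keeps showing the .htm URL.
# (The dot needs no escaping on the substitution side.)
RewriteRule ^index\.htm$ index.php [L]
RewriteRule ^gallery\.htm$ gallery.php [L]
RewriteRule ^jobs\.htm$ jobs.php [L]
RewriteRule ^information\.htm$ information.php [L]

# Catch-all for every other .htm page.
RewriteRule ^([a-zA-Z0-9_-]+)\.htm$ page.php?&id=$1 [L]
```

The follow-up request for /new-page.htm then falls through to the catch-all and is served by page.php, while the address bar shows only /new-page.htm.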

fs,

The first thing to learn about .htaccess is to keep it empty (if at all possible) because .htaccess must be read AT LEAST ONCE for every file request (including .css, .js, .gif, .jpg, etc.). If one of my clients used something like that, they would be invited to find a host which would allow abuse of the shared server - I do not!

Then, I don’t believe that mod_alias even looks at query strings, yet several of your {old URI}s contain one.

The regular expressions supported by mod_rewrite (and by some mod_alias directives, such as RedirectMatch) are designed to eliminate the type of abuse that you’ve demonstrated in the “quite long” series of Redirect statements. In other words, look for patterns you can use to cut this nonsense down!
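
As an illustration of looking for patterns, the fourteen "redirect gone /images/Bar..." lines in the posted file could probably be collapsed into a single mod_alias directive. This is a sketch based only on the file names shown in the post:

```apache
# Matches /images/BarAbout.html, /images/BarAbout.swf.html and the other
# twelve Bar* variants listed one by one in the original file.
RedirectMatch gone ^/images/Bar(About|Blog|Contact|Home|Questions|Search|Services)(\.swf)?\.html$
```

Listing the seven names explicitly keeps the rule from answering 410 Gone for any other file under /images/ that happens to start with "Bar".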

Other comments (on your mod_rewrite code):

RewriteEngine On
# non-www to www Redirect In Apache - Based on code from http://www.websitetodos.com/
RewriteCond %{HTTP_HOST} ^mydomain\.com$ [NC]
RewriteRule (.*) http://www.mydomain.com/$1 [R=301,L]
# OR RewriteRule .? http://www.mydomain.com%{REQUEST_URI} [R=301,L]
# because you're not using the (.*) in the redirection
# AND it's already available as %{REQUEST_URI}, i.e., efficiency

RewriteRule ^$ /vr_display_95bae8.php?filename=index.html [L,NC]
# to match http://www.mydomain.com/.html - why not REQUIRE letters, i.e., [a-z]+?
RewriteRule ^(.*)\.html$ /vr_display_95bae8.php?filename=$1.html [L,NC]
# ditto; and combine both by using html?$ which makes the "l" optional
RewriteRule ^(.*)\.htm$ /vr_display_95bae8.php?filename=$1.htm [L,NC]
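
Put together, the two comments on the last rules might look like the sketch below. The character class is an assumption about which characters appear in the site's file names (letters, digits, slashes, underscores and hyphens); widen it if real URLs use anything else.

```apache
RewriteRule ^$ /vr_display_95bae8.php?filename=index.html [L]

# One rule for both extensions: "html?" makes the final "l" optional, and
# requiring at least one character means a bare "/.html" no longer matches.
# NC lets [a-z] match upper-case letters as well.
RewriteRule ^([a-z0-9/_-]+\.html?)$ /vr_display_95bae8.php?filename=$1 [L,NC]
```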

Please have a think about all this - THEN come back with a rephrased question so we can help you learn.

Regards,

DK

Hi David,

Thanks for taking the time out to provide some insights. Yes, I’m afraid that the .htaccess file is something I have never sat down to learn properly… rather like a lot of people I suspect, my knowledge has just evolved as needed and if it works… great :slight_smile: I know this isn’t the best way of course.

To be honest, I’m not sure if I can use regular expressions for these urls as there is no real pattern. That is, each targeted url is a problem url sitting in amongst many that are not… so, for example, I cannot really target a given entire directory.

I’ve made your suggested changes with regards to the canonical non-www to www. I think I’m happy leaving the “vr_display” stuff as this is externally supplied.

So I still have my problem though… and here is a rephrase of the question.

If you put this url in your browser:

www.mydomain.com/Northern-Ireland/County-Armagh/Armagh-Gilford.html

Why does it redirect and change the url to:

http://www.mydomain.com/Northern-Ireland/County-Down/Down-Gilford.html?filename=Northern-Ireland/County-Armagh/Armagh-Gilford.html

Many thanks,

/Brian

bw,

It’s because you have a Redirect:

redirect 301 /Northern-Ireland/County-Armagh/Armagh-Gilford.html http://www.abipo.com/Northern-Ireland/County-Down/Down-Gilford.html

Pretty simple, eh?

If you have a list of URIs that need to be redirected, I would recommend that you ask your host to allow a RewriteMap for those. When they refuse (because you can bring down the whole server), then use PHP header() redirections within those scripts (if strictly HTML, use <meta http-equiv="refresh" content="0; url={new URL}">). To do otherwise, as explained above, is an abuse of the server and SHOULD have you barred from hosting on a shared server.
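
For reference, a RewriteMap set-up could look roughly like the sketch below. It belongs in the server or virtual host configuration, not in .htaccess, and the map name (redirects) and file path are hypothetical.

```apache
# In the <VirtualHost> block for the site.
RewriteEngine On

# Plain-text map, one "old-path new-URL" pair per line, for example:
#   /questions.html  http://www.mydomain.com/mydomain-Web-Design-Northern-Ireland-Questions.html
RewriteMap redirects "txt:/etc/apache2/redirect-map.txt"

# Redirect only when the requested path has an entry in the map.
RewriteCond ${redirects:$1} !^$
RewriteRule ^(/.*)$ ${redirects:$1} [R=301,L]
```

The lookup file can grow without adding a line to .htaccess for every retired URL, which is the appeal of this approach for long lists.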

Regards,

DK

Hi DK,

Sorry still don’t get why it’s appending a query string. The Gilford redirect is redirecting from one URL to another just like the rest of them. Can you go into a bit more detail pls?

Thanks for the heads-up to better ways to implement these 301s. As I really don’t want to bring a shared server down, I’ll maybe opt for the HTML alternative (use <meta http-equiv="refresh" content="0; url={new URL}">).

Thanks

/Brian

brian,

Sorry, I didn’t pick-up on the query string - that was added by your Drupal code:

RewriteRule ^(.*)\.htm$ /vr_display_95bae8.php?filename=$1.htm [L,NC]

With this, vr_display_yadda_yadda will be hidden by the lack of absolute redirection (the visitor should not see the query string but it should be there with it gets to yadda_yadda).

No worries about the ways to prevent abusing your server … and am glad that you understand what a massive problem it can be (especially on a shared server). Since it’s one thing to throw rocks and another to offer solutions, that’s what you got. I’m glad that you understand that you are in control of your website (and its impact on the shared server) and will take appropriate action (the best solution you have under the circumstances).

Regards,

DK

Thanks DK,

That’s exactly what I was after! One thing though, you say that “the visitor should not see the query string but it should be there with it gets to yadda_yadda” and that was my understanding of the RewriteRule too written in this context (see section “RewriteRule Syntax” on this page: http://www.seomoz.org/blog/rewriterule-split-personality-explained ). That is, the RewriteRule serves up the new content pretending it was the old content that Google Bot / browser asked for. So… why is the query string actually visible after being redirected in the Gilford redirect?

/Brian

bw,

That SHUDDA been WHEN it gets to yadda_yadda.

Indeed, the query string should not be visible but, looking through your .htaccess from the top and not being familiar with whatever Drupal does with a URI, it’s obvious that there’s something else going on which is not being shown. Yes, even the server and virtual host configuration files can play a part!
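
A likely explanation, offered as a sketch rather than a confirmed diagnosis of this server: in .htaccess, mod_rewrite generally gets to the request before mod_alias does. The vr_display rule rewrites the request internally and sets the query string to filename=...; the Redirect for the Gilford page then still matches the original path and sends the 301, carrying that query string along with it. Under that assumption, writing the redirect as a RewriteRule ahead of the vr_display rules should keep the query string out of the visible URL:

```apache
RewriteEngine On

# External 301 handled by mod_rewrite itself. [L] ends processing here,
# before the vr_display rules can attach ?filename=... to the request.
RewriteRule ^Northern-Ireland/County-Armagh/Armagh-Gilford\.html$ http://www.mydomain.com/Northern-Ireland/County-Down/Down-Gilford.html [R=301,L]

# Internal rewrites, unchanged from the posted file.
RewriteRule ^$ /vr_display_95bae8.php?filename=index.html [L,NC]
RewriteRule ^(.*)\.html$ /vr_display_95bae8.php?filename=$1.html [L,NC]
RewriteRule ^(.*)\.htm$ /vr_display_95bae8.php?filename=$1.htm [L,NC]
```

The matching "redirect 301" line for the Gilford page would be removed from the top of the file at the same time, so that only one module handles that URL.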

Regards,

DK

Hi DK,

I don’t have Drupal installed and it’s not in use in the entire domain.

I asked my host if they would prefer me to use something else for 301 redirects and they said No, keep using the .htaccess.

How exactly can one bring down a shared server with 301s in the .htaccess out of interest?

/Brian

brian,

I thought I saw Drupal in what you’d shown (was that because it wasn’t the “standard” WP code?) so my apologies. CMS’s tend to have nearly identical code to redirect EVERYTHING (without a valid destination) to its index.php file to handle the request (if it can) or act as a 404 handler.

Apache’s mod_alias (the Redirect series of directives) is the fastest redirection (and 301) but mod_rewrite adds the power of regex and its RewriteCond statements. Either is fine UNLESS you have an immense series of redirections which must be read and parsed multiple times for every file request (an abuse of the server but, if your host doesn’t care, you can get away with it … but I would change hosts if I knew they allowed the abuse which would slow my site(s)). IMHO, though, your list from the earlier post isn’t bad enough to even come close to abuse (but could be optimized with some regex and mod_rewrite).

Upon changing from Apache 1.x to Apache 2.x, the looping problem (Apache 1.x would require a restart) was solved by limiting the number of loops Apache 2 would allow (10). The result is that, today, 301’s won’t bring a server down (other than slowing it down if there is a long list). That said, mod_rewrite’s RewriteMap can bring a server down with a syntax error - that’s why its use is limited to those with access to the server or virtual host configuration files (typically not accessible by webmasters).

Regards,

DK