Correct HTTP code for redirecting misspelled / mangled URLs?

What do you think is the correct HTTP response code for redirecting misspelled or otherwise mangled URLs to the correct URL? (Assuming the correct URL can be inferred from the request).

Google suggests 301: http://googlewebmastercentral.blogspot.co.uk/2008/08/now-that-weve-bid-farewell-to-soft-404s.html

Should I 301-redirect misspelled 404s to the correct URL?
Redirecting/301-ing 404s is a good idea when it’s helpful to users (i.e. not confusing like soft 404s). For instance, if you notice that the Crawl Errors of Webmaster Tools shows a 404 for a misspelled version of your URL, feel free to 301 the misspelled version of the URL to the correct version.

However, this doesn’t seem quite right to me, as 301 means ‘Moved Permanently’, but the resource hasn’t been moved; it never existed at the requested URI.

What are your thoughts on this?

Given that 301 (permanent) and 302 (temporary) redirects are the only two possibilities here from the 300 series, I’d use a 301, simply because it’s a permanent rule and not a temporary one.

What about 303 See Other? That seems more fitting to me: we don’t have a representation of the requested resource, but we’re referring them to a resource that might result in a representation that is useful to the recipient.

I wouldn’t get too hung up on the names, because this is a case where the spec and the implementations don’t quite match up.

Here’s an example that’s common practice. Let’s say we submit a form via POST to a URL /submit. That URL does some processing then returns a 302 response, redirecting to a URL /view. Yet this behavior doesn’t quite match the spec:

If the 302 status code is received in response to a request other than GET or HEAD, the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.

But browsers certainly don’t ask for permission every time a site 302 redirects after a POST request. The behavior we observe for a 302 actually matches the spec’s description of a 303:

The response to the request can be found under a different URI and SHOULD be retrieved using a GET method on that resource. This method exists primarily to allow the output of a POST-activated script to redirect the user agent to a selected resource.

So, in theory, a 302 and a 303 should behave differently, but in practice, they behave identically.
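To make the spec-versus-practice gap concrete, here is a rough Python sketch of which method a user agent uses for the follow-up request. The function name `follow_up_method` and its `in_practice` flag are made up for illustration, and the sketch only models 302, 303 and 307; it is a simplification of what browsers do, not a reference implementation.

```python
def follow_up_method(status: int, method: str, in_practice: bool = True) -> str:
    """Return the HTTP method used for the request that follows a redirect.

    Hypothetical helper: models only 302, 303 and 307.
    in_practice=True  -> what browsers commonly do
    in_practice=False -> what the quoted spec text asks for
    """
    if status not in (302, 303, 307):
        raise ValueError("only 302, 303 and 307 are modeled in this sketch")

    # GET and HEAD are never rewritten by any of these redirects.
    if method in ("GET", "HEAD"):
        return method

    # 303 always means "go GET the other resource".
    if status == 303:
        return "GET"

    # 302 on paper keeps the method, but browsers treat it like a 303.
    if status == 302 and in_practice:
        return "GET"

    # 307 (and 302 read strictly) keeps the original method.
    return method
```

With the default `in_practice=True`, a POST answered with a 302 and a POST answered with a 303 both come back as `GET`, which is the "behave identically" point above; passing `in_practice=False` shows the difference the spec intended.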

In light of this, I suggest we don’t get too hung up on the names and instead think about the desired behavior, such as:

  1. Do you want the redirect response to be cacheable? That is, if a user visits the misspelled URL a second time, should the browser check again that the redirect is still in place, or should it use its cache and request the corrected URL directly? If you want the redirect response to be cacheable, then a 301 is your simplest option.

  2. If someone submits a POST request to a misspelled URL, do you want to drop the POST data and have the user agent GET the corrected URL? (If so, then 302 or 303.) Or do you want to keep the POST data intact and have the user agent POST to the corrected URL, which, per the spec, it should do only after asking the user to confirm? (If so, then 301 or 307 on paper. In practice, browsers commonly turn a POST into a GET when following a 301, just as they do for a 302, so 307 is the dependable choice.)

All things considered, I think 301 is your best bet. For a misspelled URL, I think cacheable is desirable. And for POST requests… well, in this scenario, we probably don’t care too much one way or the other… but the safest option is to not throw away the POST data and instead ask the user what they want to do.
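As a rough illustration of "infer the correct URL, then 301 to it", here is a minimal Python sketch using the standard library's `difflib`. Everything site-specific in it is an assumption: `KNOWN_PATHS`, the `resolve` function and the `0.8` similarity cutoff are invented for the example, and a real site would plug this into whatever framework or server it actually uses.

```python
from difflib import get_close_matches

# Hypothetical list of valid paths on the site.
KNOWN_PATHS = ["/about", "/contact", "/products"]


def resolve(path, known_paths=KNOWN_PATHS, cutoff=0.8):
    """Return (status, headers) for a requested path.

    200 for an exact match, 301 with a Location header when exactly one
    known path is a close match, 404 otherwise.
    """
    if path in known_paths:
        return 200, {}

    # Compare case-insensitively, but redirect to the canonical spelling.
    canonical = {p.lower(): p for p in known_paths}
    matches = get_close_matches(path.lower(), list(canonical), n=2, cutoff=cutoff)

    # Only redirect when the guess is unambiguous.
    if len(matches) == 1:
        return 301, {"Location": canonical[matches[0]]}

    return 404, {}
```

Asking for up to two matches and redirecting only when exactly one comes back keeps the sketch from guessing when a typo is close to several URLs; in that case a plain 404 is returned instead.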

dj,

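If you enable mod_speling, Apache corrects minor typos and capitalization errors for you. When it finds a single close match it answers with a redirect to the corrected URL, so the visitor should end up with a 200 there.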

Regards,

DK

Thanks, Jeff. After reading your well-considered post, I think I’ll have to agree with you and go for a 301.

Thanks for your input too, DK. I actually use Nginx for most of the sites I manage, but it’s interesting to know about that Apache module.