Changing Link Structure: 301 Redirect to Possible 404 or 404 First?

Here is my issue. We are changing our linking structure from

to just

Obviously I want to 301 redirect (using nginx/mod_rewrite), but I do not know if I should care about poorly or incorrectly written incoming links, such as a site that miswrites our URL as

/trailer/movie-title/video-titlez

They accidentally put a z on the end of the video title, and we’d respond with a 404 when the page is requested. Now, with the rewrite, I would be 301 redirecting this bad URL to the PHP file that analyzes it and then gives the 404 response.

Is this an okay way to do it? I’ve been told I don’t need to worry about what search engines like Google think of a 301 to a 404, since the page was a longtime 404 anyway.

The only other way to do it is to leave a PHP file at the old URL and have it check the database to confirm the link is correct, then either forward on to the proper URL if the item exists or send a 404 response if not. Unfortunately, that means my database would effectively be checked twice for every valid old link, since the PHP behind the new URL would confirm the item again after the 301.
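To make the trade-off concrete, here is a rough Python sketch (not the site's actual PHP) that counts simulated database lookups for a request to an old URL under both approaches. `FakeDb`, `new_url_handler`, `old_url_check_first`, and `old_url_blind_redirect` are hypothetical names invented for illustration, and the single stored row is made up.

```python
class FakeDb:
    """Hypothetical stand-in for the trailers table; counts queries."""

    def __init__(self, rows):
        self.rows = set(rows)
        self.queries = 0

    def exists(self, movie, video):
        self.queries += 1
        return (movie, video) in self.rows


def new_url_handler(db, movie, video):
    """Script behind the new URL: 200 if the item exists, else 404."""
    return 200 if db.exists(movie, video) else 404


def old_url_check_first(db, movie, video):
    """Old-URL script checks the database, then sends a 301 or a 404."""
    if not db.exists(movie, video):
        return [404]
    # The client follows the 301, so the new handler checks again.
    return [301, new_url_handler(db, movie, video)]


def old_url_blind_redirect(db, movie, video):
    """Web server sends the 301 unchecked; the new handler decides."""
    return [301, new_url_handler(db, movie, video)]
```

Under these assumptions a valid old link costs two lookups with the check-first script and one with the blind redirect. A mistyped link costs one lookup either way; the only difference is that the blind redirect answers with a 301 followed by a 404 instead of a direct 404.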

Seeking feedback on this.

Thanks!
Ryan

Hi cb!

You didn’t show your .htaccess code, so my comments cannot address your entire question.

Kudos for understanding that you probably don’t need “trailer” in your URIs.

It sounds like you’ve created what I’ve termed “A Poor Man’s RewriteMap” with your request handler (the one that checks your database before redirecting to the requested file OR 404 handler). Kudos for that, too, EXCEPT that it means a delay for every file request (or every request meeting the ^([-a-z]+/[-a-z]+)$ format of your movie-title/video-title URIs).

I have done something similar (article title as the request) at the wilderness-wally.com website, but my redirects are to the file handler which, if it finds the article, outputs it and, if it doesn’t, redirects to the Home Page (hopefully with a “Requested Article Not Found - Please use the TOC on the left” comment). The obvious difference is that I’ve skipped the intervening file which does your -f check (database required for this when not using file names) and embedded the “Lost and Found” in the article script.

The one caution is to check the allowable characters in URIs (see “Uniform Resource Identifiers (URI): Generic Syntax”; if you need my list of allowable characters, PM me) before redirecting to either the article handler or your request handler.

Back to your question: IF your request handler cares about “trailer/” in the URI, have it check first. If not, strip the “trailer/” before the redirection (but only because that is your new format).

WARNING: Before you change the format of your URIs, do NOT make the change if you’re including the dot character in the movie-title or video-title as that is my “marker” to send !-f requests to my article handler. I believe that you’ve used trailer as your “marker” so, if that is suddenly missing, it could disable your website.

Not enough information in the question => too much information in the response. Oh, well, at least you have the full story for your consideration.

Regards,

DK

I have nginx rewrites, so you would need to look at them to get a good idea of what I’m trying.

location /trailer {
    rewrite ^/?trailer/([-0-9a-zA-Z,-]+)/([-0-9a-zA-Z,-]+)?/?([0-9a-zA-Z,-]+)?$ /watchtrailer.php?fkey=$1&tkey=$2&tres=$3;
}

Now, this is the original link that works. I could leave watchtrailer.php in place and have it test the supplied variables against the database. If the row exists, it can then 301 redirect to the proper URL (as rewritten below); otherwise it just sends a 404. All done from the PHP file itself.

if (!-f $request_filename) {
    rewrite ^/?([0-9a-zA-Z,-]+)?/?([0-9a-zA-Z,-]+)?/?([0-9a-zA-Z,-]+)?/?([0-9a-zA-Z,-]+)? /index.php?var1=$1&var2=$2&var3=$3&var4=$4;
}

Included in the index file is the proper new-watchtrailer.php to handle the variables. So, with this, I’d have PHP handling everything in terms of doing the 301 or 404. It’s okay this way, but it sucks that I will be checking the database twice per request to the old URL (once from watchtrailer.php and then again from new-watchtrailer.php). Not an issue, except that the old URL will likely be hit about 1M times per day at the beginning.

I was hoping to go easier, and just do

location /trailer {
    rewrite ^/?trailer/([-0-9a-zA-Z,-]+)/([-0-9a-zA-Z,-]+)?/?([0-9a-zA-Z,-]+)?$ /$1/$2 permanent;
}

So just have nginx handle the 301 redirect right away, and let new-watchtrailer.php handle the request and decide whether to respond with a 200 or a 404 at that point.
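As a quick way to see which old paths that rule would redirect and where, the same regular expression can be tried out in Python. This is only a sketch: `redirect_target` is a hypothetical helper, Python's `re` module stands in for nginx's PCRE matching, and an unmatched optional capture is treated as an empty string, which is how nginx substitutes it.

```python
import re

# Same pattern as the "permanent" rewrite rule above.
OLD_URL = re.compile(
    r"^/?trailer/([-0-9a-zA-Z,-]+)/([-0-9a-zA-Z,-]+)?/?([0-9a-zA-Z,-]+)?$"
)


def redirect_target(path):
    """Return the path the rule would 301 to, or None if it does not match."""
    match = OLD_URL.match(path)
    if match is None:
        return None
    movie = match.group(1)
    video = match.group(2) or ""  # unmatched optional capture becomes empty
    return f"/{movie}/{video}"


# A mistyped link is still redirected; the new handler then decides on 404.
print(redirect_target("/trailer/movie-title/video-titlez"))
```

The rule never consults the database, so a mistyped link such as the one above is redirected like any other and the 404 comes from the script behind the new URL. Because the second capture is optional, a request ending in just the movie title and a slash is redirected as well, and any third segment is dropped.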

Let me know if that makes sense.

Cheers!
Ryan

cb,

Ouch! Obviously, nginx rewrites are quite different than Apache’s mod_rewrite! I can’t help you within the nginx world.

Taking a look at your code, though, it appears that you could do a better job of making atoms optional (rather than everything along the way, i.e., frequent use of /?). Examples embedded in your quote:
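DK's embedded examples are not reproduced here. As one possible reading of the advice, and not DK's actual suggestion, the Python sketch below contrasts a pattern where every slash and segment is optional on its own with one where a slash and its segment are optional only as a pair. Both patterns are shortened to two segments and anchored with `$` for illustration; the names `SEG`, `LOOSE`, `GROUPED`, and `matches` are hypothetical.

```python
import re

SEG = r"[-0-9a-zA-Z,]+"

# Every atom optional on its own, in the style of the index.php rewrite.
LOOSE = re.compile(rf"^/?({SEG})?/?({SEG})?$")

# Slash and segment grouped, so the second level is optional only as a unit.
GROUPED = re.compile(rf"^/({SEG})(?:/({SEG}))?/?$")


def matches(pattern, path):
    """True if the whole path is accepted by the pattern."""
    return pattern.match(path) is not None


for path in ["/movie-title/video-title", "/movie-title", "//", ""]:
    print(repr(path), matches(LOOSE, path), matches(GROUPED, path))
```

In this sketch the loose pattern accepts degenerate inputs such as `//` and the empty string, while the grouped pattern requires at least one real segment and still allows the second level to be absent.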

Regards,

DK

I actually figured it out, and I did it the same way you mentioned. :) One of the few things that was simpler with nginx:

location /trailer {
    rewrite ^/?trailer/([-0-9a-zA-Z,-]+)/([-0-9a-zA-Z,-]+)$ /$1/$2 permanent;
}

Then it goes on to new-watchtrailer.php, which supplies the 200 or 404. Works great. I also discovered I was not doing the 404 in PHP entirely right, which was good to find.

I’ll test your rewrite structure fix and removal of the commas.

Cheers!
Ryan

Pulled out the “/trailer” level. Which leads to having the index.php page to open, include
I didn’t need the third level/variable. The addition of “permanent” works perfectly for the redirect; I’ve tested it in a 301/302 checker and it does a perfect 301 to a 200. And it does do the 301 to the