Regex not working in .htaccess file

I am trying to match this pattern with .htaccess
From:

http://localhost/mainfolder/inc/article.inc.php

To:

http://localhost/mainfolder/article

Also
From:

http://localhost/mainfolder/index.php

To:

http://localhost/mainfolder/index

Or:

http://localhost/mainfolder/index/

This is my code:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule \/inc\/([A-Za-z]+)$ \/inc\/$1.inc.php [NC,L]

It doesn’t work. I also want to add another line for index.php to index/ or index

Hi A2,

You might benefit from reading the mod_rewrite tutorial http://dk.co.nz/seo as it contains explanations and sample code. It’s helped many members and should help you, too.

That article covers almost everything you’ll need to know about mod_rewrite:

  1. You should be redirecting from a URI format to a file which Apache can serve (you stated the problem backward but appear to be redirecting to .inc.php files).

  2. There are specific characters within regular expressions which require escaping with the backslash ( \ ) and you’re on OVERKILL. They are, dare I say, NEVER used in the redirection!

On the other hand, you’re to be complimented on the check for an existing file before attempting to match and make a redirection.

Compliments, too, for NOT using the :fire: EVERYTHING :fire: atom, (.*), which is the most common mistake newbies make.

THEN, before starting to code, where do you intend to put the .htaccess code for your redirection? It appears that localhost is the DocumentRoot (Apache’s http_docs folder?) and I have to admit that I prefer to use the .htaccess file in the DocumentRoot when it has general applicability. Because it seems as if you are using mainfolder, I’ll suggest code for that directory.

/mainfolder/article => /mainfolder/inc/article.inc.php

Working from YOUR code:

# great
RewriteEngine On
# even better
RewriteCond %{REQUEST_FILENAME} !-f
# many problems (listed below)
RewriteRule \/inc\/([A-Za-z]+)$ \/inc\/$1.inc.php [NC,L]
# First, because you're not in the DocumentRoot, you may need the leading /
#    (NOT escaped and NOT in the redirection as that specifies the SERVER's root then,
#    if the file is not found there, the DocumentRoot)
# Second, the second / should not be escaped either
# Third, IF article is only lowercase and capital letters, GREAT!
# Fourth, NO backslashes in the redirection!!!
# Fifth, are all your article files actually article.inc.php?
#    I ask this because the inc is generally reserved for an included script
# Finally, NEVER use the No Case flag in a RewriteRule (which only tests the case sensitive {REQUEST_URI} variable).
#    NC is designed to be used on the {HTTP_HOST} which is NOT case sensitive. 
#    Besides, you were smart enough to have specified both uppercase and lowercase in your regex so the NC is superfluous

The result would be:

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule inc/([A-Za-z]+)$ inc/$1.inc.php [L]

Now, that code will NOT redirect index to index.php. If you look back at the Fifth comment above, you should be able to see why: the lack of the inc directory and the embedded .inc file extension!

That means that you will need to either modify your regex to make it applicable to both article.inc.php and index.php (I’ll not go into that here as that code would be more complex) or repeat the above code but omit the .inc in the redirection (I would also recommend specifying inc/index in the regex and inc/index.php in the redirection). Simply add:

RewriteRule index$ index.php [L]
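
Putting the two rules together for the URLs in the original question, a minimal sketch might look like the following. It assumes the file is /mainfolder/.htaccess, that the scripts live in /mainfolder/inc/, and that the visible URL is /mainfolder/article with no inc/ segment (that last point is taken from the question, not from the code above). The ^ and $ anchors are added so that only the whole path matches.

```apache
# Hypothetical /mainfolder/.htaccess (the location is an assumption)
RewriteEngine On

# index or index/  ->  index.php
RewriteRule ^index/?$ index.php [L]

# any other letters-only name  ->  inc/<name>.inc.php,
# unless a real file with that name already exists
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([A-Za-z]+)$ inc/$1.inc.php [L]
```

The index rule is listed first so that the generic rule never rewrites index to inc/index.inc.php. Neither pattern can match its own target (both targets contain a dot), so the rules should not loop.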

I have preached for years that, when starting to code a mod_rewrite redirection, you should start with a simple word explanation of what the code is supposed to do. That means specifying the {REQUEST_URI} to be matched (including exclusions - useful for RewriteCond exclusions) and then the redirection target. When you can do that, the coding becomes trivial.

Regards,

DK

Thank you for the amazing reply :smile:
I finished the code before though and here is the final result

# Turn the engine on
RewriteEngine On

# Don't do anything to files, directories and symbolic links that already exist
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l

# This rewrites article/number/whatever-it-is
RewriteRule ^artikull/([0-9]+)/([A-Za-z0-9_-]+)$ inc/article.inc.php?id=$1&title=$2 [NC,L]

# index.php rewrites to index
RewriteRule ^([^\.]+)$ $1.php [NC,L]

# NC makes the rule non case sensitive
# L makes this the last rule processed in the current pass when it matches
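
One detail worth checking in a file laid out like this: a RewriteCond applies only to the RewriteRule that immediately follows it, so the three conditions guard the artikull rule but not the generic .php rule after it. A minimal sketch that repeats the conditions for each rule (NC is left out here, following the advice earlier in the thread):

```apache
RewriteEngine On

# Conditions for the article rule
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^artikull/([0-9]+)/([A-Za-z0-9_-]+)$ inc/article.inc.php?id=$1&title=$2 [L]

# The same conditions again, because the set above was
# consumed by the previous RewriteRule
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^([^\.]+)$ $1.php [L]
```

Without the second set, a request for an existing directory would also be rewritten to a .php name by the last rule.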

This worked perfectly fine, other than that I have created a global variable in php to actually use on all urls.
There is one article file where all articles are loaded. I also added another regex btw, which is the id of the article, so that there are no duplicates of the same article title.
I learned a lot from you and also didn’t know escapes were used inside regular expressions.
Last but not least, I have to say that for what I need, this file is fine and after your explanation I get a way better idea of .htaccess.
One thing I truly hate is not commenting out the code, but this is only when the file is already finished.
This is why I didn’t comment anything on the code.
Also, from what I understand I should not use NC flag on my RewriteRule and that was so stupid cuz I already specified a-zA-Z, lol.
I don’t know if this .htaccess file is safe for production purposes.

Oh, Alex! You’re so close yet SO far from the correct code!

First, when you tested the $2 (title) in the URI, did you TRY to break it or simply use a valid title that you had memorized? If you use CaMeL CaSe, I’ll bet that the redirection will work (you can see that if you use R=301, for TESTING only) but I’ll bet article.inc.php will throw an error when making a query to your database. Okay, the NC flag will not be the culprit but it will also have the same effect as the [A-Za-z] character range definition.

Second, the No Case flag is inherently WRONG when applied to a case sensitive variable ({REQUEST_URI} is case sensitive). Again, it’s designed to be used when someone CaMeL CaSeS your {HTTP_HOST} (attempting to break your mod_rewrite when attacking your server?).

Third, your index => index.php (you’re STILL describing it backward) redirection will also redirect garbage => garbage.php. Is that REALLY the kind of redirection you want (for index => index.php)? I think not but, of course, it’s up to you.
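
One way to keep a generic extensionless rule without sending garbage to garbage.php is to test that the target file exists first. A sketch, assuming single-segment page names made of letters, digits, underscore and hyphen (that character set is an assumption, not something stated in the thread):

```apache
RewriteEngine On

# Rewrite <name> to <name>.php only when <name>.php is a real file
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^([A-Za-z0-9_-]+)$ $1.php [L]
```

A request for a name with no matching .php file is left untouched and gets the server's normal not-found handling. If index is the only page that needs this, the explicit rule for index alone is simpler still.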

Remember, computers are LOGIC machines which make decisions based on right or wrong. They are NOT psychic and canNOT guess at your intentions. Give them clear and concise directions and you’ll never have any problems (with your coding).

Regards,

DK

I thought of doing an if else statement from php.
When someone attempts to get something from the database that doesn’t exist, it will redirect to 404.
(for example article/blablabla/blablabla)
I added a define variable in php to always point the root url and I add it in all the links.
And yes, I wanted to completely hide .php and .inc.php.
Also, I removed the [NC], so it works perfect.
The only concern I have is if the code is safe for production or it might have something that crackers can bypass.

Hi Alex,

Are you fully “cleansing” the input before allowing it to touch your database? THAT is where you have the greatest chance of a hacker getting into your system.

mod_rewrite is a good first step as you’re defending against non-digital $1 values but $2 can be problematic: Either list the acceptable values (IMHO, that gets ridiculous too quickly) or simply limit the $2 values as you have done ( [a-zA-Z_-]+ ) BUT use a standard format (all lowercase, Title_Case, or UPPERCASE) which PHP can easily enforce. Assuming you have spaces in your titles, replace them with a unique character (underscore is GREAT for this) for your links, which is easy to restore to spaces before executing your query.
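
If all-lowercase titles with underscores are chosen as the standard format, the pattern itself can refuse anything else before PHP is involved. A sketch based on the rule posted earlier in the thread; lowercase-only is just one of the formats mentioned above, picked here for illustration:

```apache
RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# id: digits only; title: lowercase letters, digits, underscore, hyphen
RewriteRule ^artikull/([0-9]+)/([a-z0-9_-]+)$ inc/article.inc.php?id=$1&title=$2 [L]
```

This only narrows what reaches article.inc.php. It does not replace validating id and title inside the script before they touch the database.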

404? I redirect to the Home page (because it has all the valid links readily available). Providing information as to the visitor’s error, though, aids hackers in determining what they need to change for their attack to be successful so I avoid that.

I’m not sure what you mean nor when you would “point the root” (NEVER point to the server’s root so be careful with your redirection).

IMHO, I hide my .php file extensions, too, but am not paranoid about it because it’s easy enough for a hacker to determine the requested file’s name. One member asked years ago about redirecting FROM the .php file request to the extensionless format but then serving the .php file after all. Loopy … but it can be done (look for “Loopy” at http://dk.co.nz/seo). That’s a bit more advanced but you appear to be nearly ready for that step (assuming it would help ease your paranoia). :laughing:
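
For reference, a common way to do that kind of two-way handling is sketched below. It is not necessarily the approach described in the tutorial, and the /mainfolder/ path and the character set for page names are assumptions. The first rule tests THE_REQUEST, which holds the original request line and is not changed by internal rewrites, so the pair should not loop.

```apache
# Hypothetical /mainfolder/.htaccess (the location is an assumption)
RewriteEngine On

# 1. The visitor typed name.php: send an external redirect to the
#    extensionless URL. Use R=302 while testing, since browsers cache 301s.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/mainfolder/([A-Za-z0-9_-]+)\.php[\s?]
RewriteRule ^ /mainfolder/%1 [R=301,L]

# 2. The extensionless URL is requested: serve name.php internally,
#    but only when that file really exists.
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^([A-Za-z0-9_-]+)$ $1.php [L]
```

Here %1 is the group captured by the RewriteCond, while $1 in the second rule is the group captured by the RewriteRule pattern.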

I’m glad that you now understand the No Case flag.

Production safe? My advice for the FIRST things to look at when coding is above; if done well, they will provide most of the protection you will need. Just remember that Security is a three-edged sword: Cost (for you) vs Convenience (for users) vs Privacy (of your data, database, code, etc.). Be paranoid but only up to the point that you feel it’s crippling you.

Regards,

DK
