Htaccess Causes 403

I cannot get to the bottom of this.
Currently, if I enter mydomain/forum or /forum/ I get a 403. Within the forum directory I have this .htaccess:

<IfModule mod_rewrite.c>
Options -MultiViews +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule \.(jpeg|jpg|gif|png)$ /forum/public/404.php [NC,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /forum/index.php [L]
</IfModule>

And within the parent directory I have this .htaccess:

Options FollowSymLinks -Indexes
  DirectorySlash Off

# Apache Rewrite Rules
<IfModule mod_rewrite.c>
  RewriteEngine On
  #RewriteOptions AllowNoSlash
  RewriteBase /

# 1. Redirect www to non-www ✓
  RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
  # If the host contains www, disregarding case, redirect to a non-www URL
  RewriteRule ^(.*)$ http://%1/$1 [R,L]

# 2. Redirect trailing slash to non-trailing ✓
  RewriteCond %{ENV:REDIRECT_LOOP} !1
  RewriteRule ^(.+[^/])/$ http://%{HTTP_HOST}/$1 [R,L]
  # Pattern: One or more characters that don't end in a slash, all ending in a slash

# Redirect /index.php or /index.html to non ✓
  RewriteCond %{REQUEST_URI} /index\.(html|php)(\?.*)?$ [NC]
  #If the request URI contains /index.html or php at the end, remove it and redirect the browser
  RewriteRule ^(.*)/index\.(html|php)(\?.*)?$ $1 [NC,R,NE,L]

# Redirect /dir/foo.(php|html) to /dir/foo ✓
  RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.(html|php) [NC]
  RewriteRule .* %1 [R,NC,L]

# 3. Make it so trailing slashes aren't required for directories ✓
  RewriteCond %{REQUEST_FILENAME}/ -d
  RewriteCond %{REQUEST_FILENAME} !-f
  #If it isn't a file but would be a directory
  RewriteCond %{REQUEST_URI} !/$
  #If it doesn't end in a slash already
  RewriteRule ^(.*)$ $1/ [E=LOOP:1]
  #Give it a slash internally, set the environment loop variable to 1 to prevent rewrite 2 from creating an infinite loop

# 4. Make it so URL doesn't require .php
  RewriteCond %{REQUEST_FILENAME} !-d
  # If the requested filename is not a directory
  RewriteCond %{REQUEST_FILENAME}\.php -f
  # If the requested filename is a valid php file, serve it to the browser.
  RewriteRule ^([^.]+)$ $1\.php

# 4. Make it so URL doesn't require .html
# Input would be draw.academy/article
# Output would be draw.academy/article.html, but only internally
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_FILENAME}\.html -f
  RewriteRule ^([^.]+)$ $1\.html

# 5. Serve /learn from /
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  #If it isn't already a file or directory
  RewriteCond %{REQUEST_URI} !^/learn/?$
  #If it isn't the learn directory itself...
  RewriteCond %{DOCUMENT_ROOT}/learn/$1\.php -f
  #ONLY if the file would exist in the learn directory. This preserves the 404 function.
  RewriteRule ^(.*)$ /learn/$1
  #Serve the file as if it were in the learn directory

# 6. Serve /includes from /
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  #If it isn't already a file or directory
  RewriteCond %{REQUEST_URI} !^/includes/?$
  #If it isn't the includes directory itself...
  RewriteCond %{DOCUMENT_ROOT}/includes/$1\.php -f
  #ONLY if the file would exist in the includes directory. This preserves the 404 function.
  RewriteRule ^(.*)$ /includes/$1
  #Serve the file as if it were in the includes directory

# 7. Prevent hotlinking and show logo instead

#  RewriteCond %{HTTP_REFERER} !^$
#  RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?directdrawing.com [NC]
#  RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?sentientoak.com [NC]
#  RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?(\w+\.)?draw.academy [NC]
#  RewriteRule \.(jpg|jpeg|png|gif)$ http://i.imgur.com/r9bAyps.png [NC,R,L]
</IfModule>

Finally, if I enter /forum/index.php and anything after, it works fine. What gives??

Hi,

It may be because your .htaccess file is ignored. See here for more information on this - http://kb.bodhost.com/htaccess-file-ignored-now-what/

Joe,

Elizine may be correct but there is a way to check that very quickly. The code is contained in my mod_rewrite tutorial at http://dk.co.nz/seo and that tutorial has helped many SitePoint members over many years.
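
For anyone reading along without the tutorial open, one common quick check looks roughly like the sketch below. It is an assumption about the general approach, not necessarily the tutorial's exact code, and test.html and /index.php are placeholder names: test.html should NOT exist on disk, and /index.php should be any page you can recognize.

```apache
# Hypothetical quick check, placed alone in the DocumentRoot's .htaccess.
# Request /test.html in the browser:
#   - the /index.php page is served -> .htaccess is read and mod_rewrite works
#   - a 404 is returned             -> the file is being ignored (check AllowOverride)
#   - a 500 is returned             -> the directives are not recognized, so
#                                      mod_rewrite is probably not loaded
RewriteEngine On
RewriteRule ^test\.html$ /index.php [L]
```

Remove the rule again once the check has been made.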

[begin rant] As for your code, do you really think the result of running a test repeatedly within microseconds will cause the result to change? Test ONCE and then get rid of the IfModule wrappers. That is a pet peeve of mine as it serves no useful purpose (other than to protect developers from being accused of destroying someone’s website who tries to use mod_rewrite when it’s not been enabled). The microseconds you are wasting on EACH PASS through .htaccess add up to lots of time over the life of a website and I consider it abusive of the server. [end of rant]

Next, the two options you have in the first block should already be set as you have them (standard server configuration - MultiViews causes problems with filenames in the path and mod_rewrite can’t be enabled without the FollowSymLinks). Another wasted effort.

Next? You’ve repeated RewriteBase / in each of your blocks. Trying to make Apache dizzy? Personally, that resets the base of the mod_rewrite code in each block and only confuses me when I’m dealing with one block or the other (more difficult to find code conflicts). The simple fact that you’ve baselined to both DocumentRoot and /forum is absurd (IMHO).

Next is the No Case flag in a RewriteRule. Please know that URIs ARE case sensitive so use the NC flag where it’s designed to be used: RewriteCond statements looking at {HTTP_HOST} (which is NOT case sensitive).

At the end of the first block, you’re not accommodating a request for /forum/ (unless you have a DirectoryIndex set). Just make the dot character optional with a ?.
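
Pulling the points about the first block together, a minimal sketch of /forum/.htaccess might look like the following. It assumes mod_rewrite has already been tested, that -MultiViews and +FollowSymLinks are the server defaults, and that /forum/index.php is the front controller as in the original block. One caveat: /forum/ is a real directory, so the !-d condition still excludes it even with the optional dot. The separate rule for the empty path is therefore an assumption about the intent, not something taken from the original code.

```apache
# Sketch only: no IfModule wrapper, no Options line, no RewriteBase, no NC flags.
RewriteEngine On

# A request for /forum/ itself reaches this file as an empty path.
RewriteRule ^$ /forum/index.php [L]

# Missing images go to the forum's 404 page (lower-case extensions only).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule \.(jpeg|jpg|gif|png)$ /forum/public/404.php [L]

# Anything else that is not an existing file or directory goes to index.php.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .? /forum/index.php [L]
```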

Your DocumentRoot’s .htaccess repeats the IfModule waste of server resources (remember, that’s a pet peeve of mine - yeah, yeah, how could you forget? :wink: ).

This RewriteBase is superfluous because you’re already in the DocumentRoot.

As a matter of style, I would have used R=301 (permanent redirection) so my visitor would know that he’s been redirected (visually, that’s what the external redirection does but a search engine may not see that).
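
As a sketch, the www rule from the DocumentRoot's .htaccess with the status made explicit would look something like this (same pattern and target as the original, only the flag changes):

```apache
# www to non-www as a permanent (301) redirect.
# NC belongs here because host names are not case sensitive.
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```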

The second rule set doesn’t make sense as I don’t see where you’re setting the REDIRECT_LOOP environment variable. Of course, there is an easier way to remove a trailing slash (but you have bigger problems from what I see - just read the tutorial and check out the code on offer).
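
One widely used pattern for this, shown here as a sketch and not necessarily the version in the tutorial, strips the trailing slash only when the request is not a real directory. Note that this is a different goal from the original rule set: it assumes directories keep their slash and that DirectorySlash is left at its default, so no loop-guard variable is needed.

```apache
# Remove a trailing slash from anything that is not an existing directory.
# Directories are skipped, so Apache's own slash handling never fights this rule.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]
```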

The third rule set is trying to match the separator between the {REQUEST_URI} and the {QUERY_STRING}. Don’t worry about it, you can NOT access that character.

I see that you’re trying to redirect to the index file without the extension being displayed. Rather than a long winded version of “OMG,” please refer to the “Direct to New Format” section of the tutorial. Ditto the foo code.

Sorry, I’ve run out of steam having gone this far and only see much of the same below. Just know that the tutorial also has code for other real world examples including rejecting hotlinking attempts.

If you have questions on the tutorial, you know where to find (or PM) me.

Sorry to be so brusque; it comes from too many years of the same questions and the tutorial used to be both a SitePoint article and in this board’s sticky threads. I am willing to help and hope that I have to some small degree already … provided I’ve not upset you with my ranting. If I have done that, I apologize!

Regards

DK

DK,

First I want to thank you very much for your in-depth explanation. I needed that rant, and the logic is so obvious it almost immediately became a pet peeve of mine. I apologize for the late reply; I broke my site and only got it working just now.

Extending that idea, every IfModule is a waste of resources after the initial check; so essentially none should be used, correct? And as long as we’re talking about cumulative microseconds, would an equivalent amount of time be saved by putting my directives in httpd.conf?

This is what I picked up from your tutorial and post. Let me know if I’m getting this right:

  • In http://site.com/forum/index.php, the REQUEST_URI would be forum/index.php, and in http://anothersite.com/foo/bar/image.png it would be foo/bar/image.png. Am I getting that right?
  • So RewriteRule can only work with REQUEST_URI. But RewriteCond can use all of them.
  • Domain names aren’t case sensitive, but URIs are (because files are case sensitive as well). This is why [NC] wouldn’t make sense in a RewriteRule.
  • External redirects will display in the browser, but if I am redirecting anything other than temporarily it should always be a 301.

2 and 3 are a pair for removing slashes cosmetically but serving as if the slash were there. Here’s what it’s been changed to:

# 2. Redirect trailing slash to non-trailing ✓
  RewriteCond %{ENV:LOOP} !1
  RewriteRule ^(.+[^/])/$ $1 [R,L]
  # Pattern: One or more characters that don't end in a slash, all ending in a slash
    
# 3. Make it so trailing slashes aren't required for directories ✓
  RewriteCond %{REQUEST_FILENAME}/ -d
  RewriteCond %{REQUEST_FILENAME} !-f
  #If it isn't a file but would be a directory
  RewriteCond %{REQUEST_URI} !/$
  #If it doesn't end in a slash already
  RewriteRule ^(.*)$ $1/ [E=LOOP:1]
  #Give it a slash internally, set the environment loop variable to 1 to prevent rewrite 2 from creating an infinite loop

I’ve removed the RewriteBases and IfModules. I will try to rewrite my rules better, but in the meantime here is the error from my RewriteLog:

init rewrite engine with requested uri /forum
(1) pass through /forum
(2) init rewrite engine with requested uri /includes/403.php
(1) pass through /includes/403.php

The htaccess is not being ignored, AllowOverride is All, and every other rewrite for every other scenario works. I cannot for the life of me figure out why Apache would choose to just pass /forum through. Even a nonexistent path like /iii is at least processed once.

Hi Joe,

Thanks for taking the rant as intended: to drive home the point. Congrats, too, on recognizing the problem and making it a pet peeve of your own!

Should <IfModule> ever be used? Yes, but only if you are a code developer for others who may not have mod_rewrite enabled. IMHO, in that case, it’s okay but should ONLY be included AFTER a test is made and a 500 error is thrown for unrecognized mod_rewrite code (that would be the result on a server without mod_rewrite and mod_rewrite code in the .htaccess file). In short, comment out any <IfModule> wrappers in code you receive, run a test and be ready to uncomment them if a 500 error is received. Thanks for allowing me to explain more fully.
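
A sketch of that comment-out-and-test step, using the catch-all rule from the forum block as the example (the rule itself is unchanged from the original):

```apache
# Step 1: test with the <IfModule> wrapper commented out.
# <IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /forum/index.php [L]
# </IfModule>
# Step 2: if pages load, mod_rewrite is present and the two commented
# wrapper lines can be deleted. If a 500 error is thrown, mod_rewrite is
# missing, so uncomment them (the rules will then simply be skipped).
```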

Questions:

  1. Correct! Of course, mod_rewrite’s job is to change the {REQUEST_URI} string and, if 301’d, display that change, too.

  2. Correct again! However, think of RewriteRule as the redirection associated with a BLOCK of code which can include 0 - (hopefully few) RewriteCond statements which will be ANDed (or ORed) with the trailing RewriteRule.

  3. Correct! However, the NC flag in a RewriteRule can MISdirect because of incorrect capitalization (mod_speling is designed to eliminate that problem but is rarely available on production servers). Therefore, my recommendation is to ONLY use NC with {HTTP_HOST} variables.

  4. Correct again (four for four)! mod_rewrite redirections are 302 (temporary) by default so, IF you want to display internal redirections (absolute or relative), then the 301 is required. Of course, not so with external redirections as you stated.
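
To make points 1 to 3 concrete, here is a small annotated sketch. The host, path and query string are made-up examples. One detail worth noting: %{REQUEST_URI} keeps its leading slash, while the RewriteRule pattern in a DocumentRoot .htaccess is matched against the path with that slash stripped.

```apache
# Request: http://site.com/forum/index.php?page=2   (page=2 is hypothetical)
#   %{HTTP_HOST}     site.com            (not case sensitive, so NC is fine)
#   %{REQUEST_URI}   /forum/index.php    (leading slash, no query string)
#   %{QUERY_STRING}  page=2              (the "?" is never part of either)
#
# RewriteRule pattern in the DocumentRoot's .htaccess sees: forum/index.php
# RewriteRule pattern in /forum/.htaccess sees:             index.php
RewriteEngine On
RewriteCond %{HTTP_HOST} ^site\.com$ [NC]
RewriteCond %{QUERY_STRING} ^page=2$
# "-" means no substitution; L stops processing for this request.
RewriteRule ^forum/index\.php$ - [L]
```

Both RewriteCond lines are ANDed together and apply only to the RewriteRule that follows them.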

Your treatment of the trailing / for directories leaves me cold. Okay, I rely on Apache to add the trailing slash (technically, it’s incorrect to remove it and “silly” to add it back … which Apache would do for you) AND, if not done correctly (as you ARE doing - albeit I would use the {IS_SUBREQ} rather than creating another Apache variable to do the same thing), it could make your relative links point to the wrong subdirectory level. On the other hand, “if it ain’t broke, don’t fix it!”

The log raises a couple of questions:

  1. Is the .htaccess shown in the DocumentRoot?

  2. Directory structure - is forum a subdirectory (and NOT a file) in the DocumentRoot?

  3. I believe 403 is an “access not allowed” code so that implies an incorrect permission set within the forum.

Regards,

DK
