Determining if all requests will read both .htaccess files

My application has the following structure:

…/public_html/ (application root)
…/public_html/www_root/ (my webroot folder containing all images, css, js etc along with my front controller index.php)

I have an .htaccess file in both of these folders. My first .htaccess file is a lot more comprehensive regarding rules etc but essentially will forward all requests to my www_root/ folder:


<IfModule mod_rewrite.c>

    # Follow all symbolic links but do not index directories
    Options +FollowSymlinks -Indexes

    # Turn the rewrite engine on
    RewriteEngine On

    # Turn the server signature off
    ServerSignature Off

    # Protect against DOS attacks by limiting file upload size
    LimitRequestBody 10240000

    # Do not allow access to any .ht file
    <FilesMatch "^\.ht">
        Order allow,deny
        Deny from all
        Satisfy All
    </FilesMatch>

    # Protect the php.ini file:
    <files *.ini>
        order allow,deny
        deny from all
    </files>

    # Ensure all URLs are served from the www. host (SEO, not for development purposes)
    # Consecutive conditions are ANDed: redirect only if the host is none of these
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    RewriteCond %{HTTP_HOST} !^localhost [NC]
    RewriteCond %{HTTP_HOST} !^testdomain\.com [NC]
    RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

    # Don't send the Apache error document; send not-found requests to index.php instead
    ErrorDocument 404 index.php

    # Send a forbidden request to some known web spiders
    RewriteCond %{HTTP_USER_AGENT} ^Anarchie [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ASPSeek [OR]
    RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
    RewriteCond %{HTTP_USER_AGENT} ^autoemailspider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xenu [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus
    RewriteRule ^.* - [F,L]
   
    # Rewrite all empty requests to www_root/
    RewriteRule    ^$ www_root/    [L]

    # Rewrite all requests except selected filenames to www_root
    RewriteCond %{REQUEST_URI} !^.*\.(gif|jpg|png|ico|css|js|swf|wav|mp3|less|cur)$
    RewriteRule    (.*) www_root/$1 [L]
</IfModule>

<IfModule !mod_rewrite.c>
    <IfModule mod_alias.c>
        # When mod_rewrite is not available, we instruct a temporary redirect of
        # the startpage to the front controller explicitly so that the website
        # and the generated links can still be used.
        RedirectMatch 302 ^/$ /index.php/
        # RedirectTemp cannot be used instead
    </IfModule>
</IfModule>

My .htaccess file in my www_root/ folder sends all requests to the index.php file:


<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.*)$ index.php [QSA,L]
</IfModule>
<IfModule !mod_rewrite.c>
    <IfModule mod_alias.c>
        # When mod_rewrite is not available, we instruct a temporary redirect of
        # the startpage to the front controller explicitly so that the website
        # and the generated links can still be used.
        RedirectMatch 302 ^/$ /index.php/
        # RedirectTemp cannot be used instead
    </IfModule>
</IfModule>

Since my front controller resides in the www_root/ folder (…/public_html/www_root/index.php), that folder is effectively the base path of all HTML requests. So when I have something like <img src="images/someimage.jpg" />, will that request be processed by both of my .htaccess files, or only by the one in my www_root/ folder? I ask because I do not want to duplicate all the rules in both .htaccess files, but I fear that otherwise the “html” requests will not be filtered by the rules in my application root folder.

Also, not knowing better, is there a better or more efficient way to achieve the above? I have collected these rules over the years, so I am not sure whether some of them are overkill.

Apache will do its best to combine the directives from both .htaccess files, but if the same kind of directive appears in both files, one file's directives will override the other's. That means the rewrite rules in /public_html/www_root/.htaccess will override the rewrite rules in /public_html/.htaccess. As a result, your “www” rule likely won’t work, the spiders rule likely won’t work, and certain file extensions won’t be excluded. In this case the last point is OK, because your /public_html/www_root/.htaccess does the same thing in a better way, by excluding all real files (!-f).

To avoid these problems, you’ll have to keep all your rewrite rules in one place, probably in the /public_html/.htaccess.
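
As a rough sketch of what "one place" could look like: the fragment below keeps the routing in /public_html/.htaccess only. It assumes that public_html is the DocumentRoot, that the rewrite rules have been removed from www_root/.htaccess, and that the remaining rules from the question (www redirect, spider blocking) sit above these lines. It is an illustration, not a tested drop-in replacement.

```apache
# /public_html/.htaccess (sketch, assumes public_html is the DocumentRoot)
RewriteEngine On

# Requests for real files or directories under www_root/ are mapped there directly.
# $1 in the conditions refers to the group captured by the RewriteRule pattern.
RewriteCond %{DOCUMENT_ROOT}/www_root/$1 -f [OR]
RewriteCond %{DOCUMENT_ROOT}/www_root/$1 -d
RewriteRule ^(?!www_root/)(.+)$ www_root/$1 [L]

# Everything else goes to the front controller. The negative lookahead keeps
# already-rewritten URLs from being rewritten again on the next pass.
RewriteRule ^(?!www_root/) www_root/index.php [L]
```

With this layout a request for images/someimage.jpg is served from www_root/images/someimage.jpg if that file exists, and handed to index.php otherwise. The query string is carried over unchanged because neither substitution contains one.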

NV,

Because you have stated that your application is outside your “webspace,” it will be irrelevant to your web requests.

NORMALLY (you say it’s not your setup), your webspace would start at public_html (normally the DocumentRoot), and Apache would read and parse EVERY .htaccess in the path, IN TURN, to the requested file. Therefore, your public_html’s .htaccess must complete its processing before handing off to www_root’s .htaccess. Both .htaccess files would be read, parsed and effect any redirection before any file is served. With this understanding, you can now put yourself in the place of the server and look for matches in your code, remembering that any redirection will restart the process (from the DocumentRoot).

As for your use of <IfModule> wrappers, I have a standard rant to deal with this server abuse:

[rant #4] The definition of an idiot is someone who repeatedly does the same thing expecting a different result. Asking Apache to confirm the existence of ANY module with an <IfModule> … </IfModule> wrapper is the same thing in the webmaster world. DON’T BE AN IDIOT! If you don’t know whether a module is enabled, run the test ONCE then REMOVE the wrapper as it is EXTREMELY wasteful of Apache’s resources (and should NEVER be allowed on a shared server). [/rant 4]
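
For illustration, the www_root/.htaccess from the question would look like this once the wrapper is removed. This is only a sketch and assumes mod_rewrite has already been confirmed as loaded; without the wrapper, a missing module makes Apache answer with a 500 error rather than silently skipping the rules.

```apache
# www_root/.htaccess without the <IfModule> wrapper
# (assumes mod_rewrite is known to be enabled on this server)
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php [QSA,L]
```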

You might benefit from reading the mod_rewrite tutorial linked in my signature as it contains explanations and sample code. It’s helped many members and should help you, too.

Regards,

DK

I have to add the caveat that anyone who learns from this tutorial will have to unlearn some incorrect information later on. I strongly urge everyone to learn instead from the official documentation.

It actually happens in the reverse order. www_root’s htaccess first, then public_html, and on up. See Apache HTTP Server Documentation: How directives are applied.
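
One way to check on your own server which .htaccess files take part in a given request is to let each file set its own response header and then look at the response headers in the browser's developer tools. The sketch below assumes mod_headers is loaded and that AllowOverride permits it; the header names are made up for the test.

```apache
# In /public_html/.htaccess (hypothetical header name)
Header always set X-Htaccess-Root "read"

# In /public_html/www_root/.htaccess (hypothetical header name)
Header always set X-Htaccess-Webroot "read"
```

Keep in mind that this only shows which files were merged for the final request. It does not show which RewriteRule sets ran, since mod_rewrite rules from a parent directory are not inherited by default.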

More bad info (and flame) from JM.

Members, look through older posts in this board and see how many I have helped with accurate information … since 2006.

Regards,

DK

No doubt there have been plenty of occasions where you’ve helped people using perfectly accurate information. But everyone makes mistakes from time to time, and the issue now is that you don’t acknowledge or correct your mistakes. In fact, you double down on them. Even when the issue is black and white – that is, clearly spelled out by the documentation and backed by simple, repeatable tests – you still refuse to acknowledge a mistake and instead you insist on repeating it. The quickest and easiest way to move past a mistake is to own up to it.

JM,

Too true! I’ve got years of experience with mod_rewrite starting on Apache 1.x and extending through 2.x and have always corrected my mistakes. When you have the experience and knowledge, will you …? In the meantime, please don’t continue to promote incorrect information.

End of discussion (I don’t have the time to waste in silly arguments).

DK