How to get the correct HTTP status code in Apache logs?

I have the following rewrite rule in .htaccess that rejects site access for certain IPs:

RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^123\.123\.123\.123$
RewriteRule .* - [F,L]

The problem is that when there is a request from this IP then the Apache access log shows this as a successful request with status code 200 whereas the banned user actually got 403. The request is logged in the Apache error log (without the status code) but it’s very inconvenient to have those requests split into separate files. I’ve noticed this problem occurs only when I use mod_rewrite to send 403 like above.

Is there a way to make Apache log actual status codes into its access logs when the status code is defined in .htaccess?

L_J,

I find it difficult to believe that your code would log a 200 status when the request from 123.123.123.123 is Failed. On the other hand, if you say that it does, then the Last flag may be involved, since it is not required here.

Note: I used to “preach” that the Last Flag is required to “terminate” a RewriteRule block - IN ERROR! Ditch that Last flag and try it again.

Regards,

DK

I would use mod_authz_host for this, since it was built for this specific purpose:

order allow,deny
deny from 123.123.123.123
allow from all

The manual says Apache will raise a 403 for the banned IP. Don’t know about the log, you’d have to check that for yourself :)

Thanks, I tried removing the Last flag but it didn’t seem to change anything. I think I have a bigger mystery to solve, because the correct 403 status is actually recorded in access_log when I ban myself and try to access the site.

However, there are some bots visiting my site too often, and whenever my automated system bans them by putting their IP into .htaccess I don’t see the 403 status right after the ban. According to access_log they still manage to get a few pages (200 status) within several seconds and then they disappear, that is, they no longer access my site according to the log. This has happened a few times and I can’t really understand what is going on. Normally, after the ban such a bot should keep requesting pages and receive 403, and I should be able to see that in access_log. That is how it used to work. But it doesn’t work like this for most of my recent bots: there’s no trace of 403 anywhere in the logs, yet the bans seem to be successful because the bots stop coming. The 403 status is recorded when I ban myself, so the system should be working fine…

That is a good idea, I’ll try it after I get a few explanations from the server admin. I get a large number of successful requests in the error_log so I suspect there may be something weird in the server setup.

Having said that, your automated ban list may simply have missed concurrent attempts to access your website before the ban could take effect. I don’t see any problem with this (from a webmaster’s point of view), but only because the 403 has been verified and is likely a permanent feature of your ban list.

Note: I’d implemented this exact sort of thing (modifying the .htaccess to ban an IP address) on a website but never saw the ban list increase (after my testing verified that it would work) so I have to give you kudos for your success in the production environment.

Regards,

DK

I don’t see how concurrent accesses could be the issue because each request is logged sequentially one after another and they are always at least a few microseconds apart.

I don’t know what you mean by “ban list increase” - my goal is not to increase the ban list but to decrease requests from banned IPs. My system is not complicated and seems to work well: every 5 minutes a cron job runs and counts the number of visits from each IP within the last 10 minutes. If an IP has more than 300 visits (which is one visit every two seconds on average) then it is added to the ban list in .htaccess for 24 hours. I log each request to a database table, which makes it very easy to count visits later on, much easier and faster than parsing Apache logs.
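The counting step described above can be sketched roughly as follows. This is only an illustration, not the actual script: the function names `ips_to_ban` and `htaccess_rules` are hypothetical, the request log is assumed to be available as (ip, timestamp) pairs already read from the database table, and the 24-hour expiry of bans is left out.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # look-back period described in the post
THRESHOLD = 300                 # an IP is banned when it exceeds this many visits


def ips_to_ban(requests, now):
    """Return the sorted IPs with more than THRESHOLD visits inside WINDOW.

    `requests` is an iterable of (ip, timestamp) pairs; this shape is an
    assumption, the real data lives in a database table.
    """
    cutoff = now - WINDOW
    counts = Counter(ip for ip, ts in requests if ts >= cutoff)
    return sorted(ip for ip, n in counts.items() if n > THRESHOLD)


def htaccess_rules(ips):
    """Build mod_rewrite lines that answer 403 for every IP in `ips`."""
    if not ips:
        # Without any RewriteCond the RewriteRule would block everyone.
        return ""
    lines = ["RewriteEngine On"]
    for i, ip in enumerate(ips):
        escaped = ip.replace(".", r"\.")
        # Every condition except the last needs [OR] so that any match bans.
        suffix = " [OR]" if i < len(ips) - 1 else ""
        lines.append("RewriteCond %{REMOTE_ADDR} ^" + escaped + "$" + suffix)
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines)
```

A cron job could call `ips_to_ban(rows, datetime.now())` every 5 minutes and write the result of `htaccess_rules(...)` into .htaccess. The empty-list guard matters, because a bare RewriteRule with no conditions would return 403 to every visitor.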

Now back to the topic:
After closer examination of all the access logs I have available, I must admit I misinterpreted the actual problem. Just after the bot is banned in .htaccess it doesn’t access the site with 200 status as I previously thought. Those 200 status entries in the access log were all from before the ban took place, so nothing abnormal here. From the moment the ban takes place there are no more requests from that IP in either the access log or the error log: the bot simply disappears. Because it is extremely unlikely that a bot would stop visiting my site just one second before it is banned, I conclude that my Apache server does not log 403 requests from that bot at all. However, it logs 403 fine for other IPs. I even tried spoofing the UA (which is “Zend_Http_Client”) but still could not reproduce the failure to log.

So I have a different mystery to solve. Do you know what could prevent Apache from saving requests to the access log? I’ll be investigating this with my admin, but maybe there are some obscure techniques for visiting servers without leaving any traces in access logs? Perhaps by sending special HTTP headers or other trickery?