I’m trying to cut down on spammers who keep making junk requests to my site, using a different IP address for each request. The basic access log entry pattern I’m seeing from them is as follows:
<IP ADDRESS> - - [<DATE / TIMESTAMP>] "GET /?q=node/add HTTP/1.1" 403 5507 "<WEBSITE>" "Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
Nine times out of ten, these requests use the above flavor of WebKit / Safari user agent, and they always come from a different IP address. That makes it a whack-a-mole situation that can’t be fixed by blocking an entire subnet or anything similar. I’m assuming it’s some sort of botnet trying to spread malware or spam?
What I tried to do is the following:
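# No cookie yet and the request is for one of the listed file types: set cookievar=true
# (lifetime 86400; note that the CO flag counts this value in minutes)
RewriteCond %{HTTP_COOKIE} !cookievar
RewriteCond %{REQUEST_FILENAME} \.(gif|jpe?g|png|js|css|swf|php|ico|txt|pdf|xml)$ [NC]
RewriteRule .* - [L,co=cookievar:true:%{HTTP:Host}:86400]
# No cookie and the request line mentions user/register or node/add: answer 403 Forbidden
RewriteCond %{HTTP_COOKIE} !cookievar
RewriteCond %{THE_REQUEST} (user\/register|node\/add)
RewriteRule .* - [F]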
I’m not very good with .htaccess code (as you may be able to tell from the above), but my intention was to make any browser that loads my assets store a cookie, and then use that cookie to check whether the visitor is an actual user. If the cookie is present, I let them through to the website. Otherwise, I stop them before they can use any server resources. It’s my understanding that blocking a request at the .htaccess level stops it at the web server, before it ever reaches the application itself.
Unfortunately, my logs indicate that it’s not working the way I hoped, and I’m hoping someone here might know why. What I’d love to do is block, at the .htaccess level, every GET request to user/register or node/add that arrives without my cookie, so that these constant visits don’t eat up server resources.
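For comparison, here is a rough, untested sketch of the blocking half written against the query string and path instead of %{THE_REQUEST}. It assumes mod_rewrite is enabled, that the site is served from the document root, and that these lines sit above the CMS’s own rewrite rules. The RewriteEngine line and the exact patterns are my guesses, not something I’ve confirmed works.

```apache
RewriteEngine On

# Hypothetical variant: look for the target in the ?q= query string or in the path.
# The first condition is ANDed with (second condition OR third condition).
RewriteCond %{HTTP_COOKIE} !cookievar
RewriteCond %{QUERY_STRING} (^|&)q=(user/register|node/add) [NC,OR]
RewriteCond %{REQUEST_URI} ^/(user/register|node/add) [NC]
# Answer 403 Forbidden; [F] implies [L], so no later rule in this pass runs.
RewriteRule ^ - [F]
```

The idea is that a request such as GET /?q=node/add without the cookie would be refused by the second condition, and a clean-URL request such as GET /node/add would be refused by the third.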
Insights would be appreciated.