Adding anti-spam to contact form

laflair13 · July 21, 2012, 8:35pm

Hey All,

I have searched and searched and I am sure I overlooked it somewhere.

I am wanting to put an anti-spam thing on my contact form.

Something like 2 + 4 =

Then they have to put the answer (6) in the blank or the form will not send.

Is there anywhere I can read how to do this?

Here's what I have for the form:

<form name="htmlform" id="contactus" method="post" action="mail.php">
<table width="450px">
<tr>
 <td valign="top">
  <label for="full_name">Full Name *</label>
 </td>
 <td valign="top">
  <input  type="text" name="full_name" maxlength="50" size="30">
 </td>
</tr>
 
<tr>
 <td valign="top">
  <label for="company">Company *</label>
 </td>
 <td valign="top">
  <input  type="text" name="company" maxlength="50" size="30">
 </td>
</tr>
<tr>
 <td valign="top">
  <label for="email">Email Address *</label>
 </td>
 <td valign="top">
  <input  type="text" name="email" maxlength="80" size="30">
 </td>
 
</tr>
<tr>
 <td valign="top">
  <label for="telephone">Telephone Number</label>
 </td>
 <td valign="top">
  <input  type="text" name="telephone" maxlength="30" size="30">
 </td>
</tr>
<tr>
 <td valign="top">
  <label for="website">Website Address</label>
 </td>
 <td valign="top">
  <input  type="text" name="website" maxlength="80" size="30">
 </td>
</tr>
<tr>
 <td valign="top">
  <label for="comments">Comments *</label>
 </td>
 <td valign="top">
  <textarea  name="comments" maxlength="1000" cols="25" rows="6"></textarea>
 </td>
 
</tr>
<tr>
 <td colspan="2" style="text-align:center">
  <input type="submit" value="Submit">
 </td>
</tr>
</table>
</form>
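
For anyone landing here with the same question, the "2 + 4 =" idea can be sketched roughly like this. It is written in Python only for illustration (the form above posts to mail.php, so the same logic would need porting to PHP), and `make_challenge` and `check_answer` are made-up names. The assumption is that the expected answer is kept server side (for example in a session) and never sent in a hidden field, since a bot could simply read it back.

```python
import random


def make_challenge(rng=random):
    """Return (question_text, expected_answer) for a simple addition puzzle."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"{a} + {b} =", a + b


def check_answer(submitted, expected):
    """True only if the submitted text parses to exactly the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except (ValueError, AttributeError):
        # Blank, non-numeric, or missing input counts as a wrong answer.
        return False


question, answer = make_challenge()
print(question)                           # show this next to the extra input
print(check_answer(str(answer), answer))  # a correct reply passes
```

If `check_answer` returns False, the handler would skip sending the mail and show the form again.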

ralphm · July 22, 2012, 11:08am

Hi David. We had an interesting discussion about this recently, where it was felt that a better method is to put a hidden timer field on the form.

Anyhow, some interesting solutions there. Either way, a field like you are suggesting is best hidden via CSS, to trap spam bots into filling it in. You can set the field to abort the form if anything is put into the field other than the answer. (It’s called a “honeypot”.)
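
A minimal sketch of the server-side half of a honeypot, in Python for illustration. It uses the common variant where the trap field must stay empty, rather than one that expects a specific answer. The field name `contact_me_by_fax` is hypothetical; it just needs to differ from the real fields (the form in the first post already has a genuine `website` input), and the field itself would be hidden with CSS in the page.

```python
def is_probably_bot(form, honeypot_field="contact_me_by_fax"):
    """Treat any non-blank value in the hidden trap field as a bot submission.

    `form` is assumed to be a dict-like mapping of submitted field names
    to string values.
    """
    return bool(form.get(honeypot_field, "").strip())


submission = {"full_name": "Ann", "comments": "Hello", "contact_me_by_fax": ""}
if is_probably_bot(submission):
    print("dropped")
else:
    print("accepted")
```

Since a few real users may see the field, a visible label such as "leave this blank" is a sensible companion to it.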

Do any of your other form fields have protection on them?

Markdidj · July 22, 2012, 12:43pm

I like that ralph, I’ve never even come across the honeypot method. I haven’t had any spam get through for ages either, but it’s a good method to use to reduce work on the back end.

I use things like making sure the username is not the same as the message header or contents, making sure there are no double posts, IP filtering (but check the IP address first before adding it), and word filtering.
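
A rough sketch of the first of those checks, in Python for illustration. The function name and the case-insensitive comparison are assumptions; the error wording is borrowed from the log output quoted later in this thread.

```python
def validate_post(name, title, message):
    """Return a list of error strings; an empty list means the post passed."""
    errors = []
    # Compare trimmed, lower-cased copies so "Buy" and " buy " count as equal.
    n, t, m = (s.strip().lower() for s in (name, title, message))
    if t == m:
        errors.append("Message title must not be the same as the message contents")
    if n in (t, m):
        errors.append(
            "User name must not be the same as message title or message contents"
        )
    return errors


print(validate_post("Ann", "Question", "How do I block spam?"))
print(validate_post("cheap pills", "cheap pills", "cheap pills"))
```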

I’ve also created a clean & block, which enables me to copy and paste a bit of the message and add it to an input box, which then goes to the server, adds the word or phrase to a text file and deletes all messages that contain it. The text file is imported into the regular expression that validates user messages.

It’s working well, and that’s without the honeypot, which I’m going to add next.

ralphm · July 22, 2012, 1:49pm

Sounds interesting. In the thread I linked to, it was pointed out that a problem with the honeypot is that some users may see it (like those on screen readers or with CSS off), meaning that they should really get some kind of instruction with an allowable answer. The alternative was felgall’s timer method, which disallows the submission if it is posted within a few seconds of page load, perhaps a typical scenario for a bot. That’s what I’m going for next, as it leaves legitimate users alone completely.

Markdidj · July 22, 2012, 2:44pm

I doubt that bots run JavaScript, and you can’t rely on it for people without it. So how would you get the time difference between when a user gets the form and when they send it back? You don’t have anything reliable enough on either the server side or the client side, at least not that you could depend on for all legitimate messages.

If you let the server set the “server sent form time”, you can’t rely on JavaScript to modify it. Neither can you rely on it to set it. There is also nothing reliable enough server side to cross-reference the sent and received times with.

With the validation I use, the text file is run through Replace(text, vbCrLf, "|") or str_replace("\r\n", "|", $text) and goes straight into the regular expression, so I end up with (spam|pawn|i\'ve a large mouse|make your didgeridoo bigger), which also makes the text file easy to update or sort alphabetically. The errors that come from the validation list what was wrong. Here’s what it’s outputting right now (flagged.txt was added about 4 weeks ago):
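
The blocklist-to-regex step can be sketched like this, in Python for illustration. One deliberate difference from the approach described: each phrase is escaped before joining, so punctuation in a pasted phrase is matched literally instead of being read as regex syntax. The function names are made up.

```python
import re


def build_block_pattern(text):
    """Compile one alternation from a newline-separated list of phrases.

    Returns None when the list is empty, because an empty alternation
    would match every message.
    """
    phrases = [line.strip() for line in text.splitlines() if line.strip()]
    if not phrases:
        return None
    return re.compile("|".join(re.escape(p) for p in phrases), re.IGNORECASE)


def contains_blocked(message, pattern):
    """True if the message contains any blocked phrase as a substring."""
    return bool(pattern and pattern.search(message))


pattern = build_block_pattern("spam\r\npawn\r\nmake your didgeridoo bigger\r\n")
print(contains_blocked("Make your DIDGERIDOO bigger today", pattern))
print(contains_blocked("Just saying hello", pattern))
```

Note that this is plain substring matching, so a short entry like "pawn" would also flag "pawnshop"; wrapping entries in word boundaries is one way to relax that.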

178.137.202.244|2012-07-09 14:44:49|-|User name must not be the same as message title or message contents|–|Email address not valid|–|Your message contained a blocked word or URL|-
46.119.115.22|2012-07-05 21:23:35|-|Your message contained a blocked word or URL|-
188.143.232.211|2012-06-26 17:19:06|-|User name must not be the same as message title or message contents|–|Email address not valid|–|Your message contained a blocked word or URL|-
188.143.232.211|2012-06-25 14:51:31|-|User name must not be the same as message title or message contents|–|Email address not valid|–|Your message contained a blocked word or URL|-
188.143.232.211|2012-06-24 15:59:37|-|User name must not be the same as message title or message contents|–|Email address not valid|–|Your message contained a blocked word or URL|-
188.143.232.211|2012-06-24 12:00:21|-|User name must not be the same as message title or message contents|–|Email address not valid|–|Your message contained a blocked word or URL|-
91.207.7.254|2012-06-23 20:53:39|-|Your message contained a blocked word or URL|-
91.232.96.17|2012-06-23 18:13:18|-|User name must not be the same as message title or message contents|–|Your message contained a blocked word or URL|-
91.207.7.254|2012-06-20 11:53:00|-|Your message contained a blocked word or URL|-
188.143.234.19|2012-06-20 08:35:58|-|Your message contained a blocked word or URL|-
80.7.200.231|2012-06-16 20:09:12|-|Message title must not be the same as the message contents|–|User name must not be the same as message title or message contents|–|Your message contained a blocked word or URL|-
80.7.200.231|2012-06-16 20:03:45|-|Message title must not be the same as the message contents|–|User name must not be the same as message title or message contents|–|Your message contained a blocked word or URL|-
80.7.200.231|2012-06-16 20:03:33|-|Message title must not be the same as the message contents|–|User name must not be the same as message title or message contents|–|Your message contained a blocked word or URL|-
80.7.200.231|2012-06-16 20:00:24|-|Message title must not be the same as the message contents|–|User name must not be the same as message title or message contents|–|Your message contained a blocked word or URL|-

As you can see, most spam bots are triggering it multiple times. From this file, named flagged.txt, I can get the IP addresses, look for repeat offenders and block them.
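
Pulling repeat offenders out of a log in that pipe-delimited shape could be sketched as follows, in Python for illustration. It assumes only that the IP address is the first field of each line; the function name and the `min_hits` threshold are made up.

```python
from collections import Counter


def repeat_offenders(lines, min_hits=2):
    """Map each IP that appears at least `min_hits` times to its hit count."""
    hits = Counter(
        line.split("|", 1)[0].strip() for line in lines if line.strip()
    )
    return {ip: count for ip, count in hits.items() if count >= min_hits}


sample = [
    "188.143.232.211|2012-06-26 17:19:06|-|Email address not valid|-",
    "188.143.232.211|2012-06-25 14:51:31|-|Email address not valid|-",
    "46.119.115.22|2012-07-05 21:23:35|-|Your message contained a blocked word or URL|-",
]
print(repeat_offenders(sample))
```

With a real file, `open("flagged.txt")` can be passed in directly, since iterating a file yields its lines.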

It might need relaxing a bit, as I get very little spam, if any, but my site isn’t busy enough to tell. From this I should be able to work it out though.

ralphm · July 22, 2012, 11:13pm

In the thread I linked to in post #2, a solution for this was proposed which seems pretty good to me. What do you think of it?

Markdidj · July 22, 2012, 11:49pm

To get round that form I would just send a timestamp of time-8, as they’re looking for a time difference greater than 7. Or set the load time to epoch time. And it’s relying on JavaScript, which I thought was a no-no for accessibility.
I don’t think you can use a minimum or maximum time, as you’d need to rely on JavaScript, cookies, sessions or some other state to cross-reference it. None are used by everyone.
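
One possible answer to the forged-timestamp objection, sketched in Python for illustration: the server signs the load time with a secret key and puts both in a hidden field, so no JavaScript, cookie or session is needed and an edited timestamp fails the check. This is not the method from the linked thread, just a sketch under stated assumptions; `SECRET_KEY`, the function names and the one-hour `max_age` are all hypothetical. It does not stop a bot that fetches the form and genuinely waits before posting.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"  # hypothetical; must never appear in the page source
MIN_SECONDS = 7            # the threshold mentioned in this thread


def issue_token(now=None):
    """Value for a hidden form field: '<timestamp>.<hex signature>'."""
    ts = str(int(time.time() if now is None else now))
    sig = hmac.new(SECRET_KEY, ts.encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{sig}"


def token_is_acceptable(token, now=None, max_age=3600):
    """Accept only an untampered token that is old enough but not stale."""
    ts, _, sig = (token or "").partition(".")
    expected = hmac.new(SECRET_KEY, ts.encode(), hashlib.sha256).hexdigest()
    # Compare as bytes in constant time; a changed timestamp changes `expected`.
    if not ts.isdigit() or not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    age = (time.time() if now is None else now) - int(ts)
    return MIN_SECONDS < age <= max_age


token = issue_token(now=1000)
print(token_is_acceptable(token, now=1003))  # posted too quickly
print(token_is_acceptable(token, now=1010))  # plausible human delay
```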

ralphm · July 23, 2012, 2:27am

Is it? It seems all PHP to me.

Markdidj · July 23, 2012, 3:31am

Oh yes, sorry, but either way it doesn’t matter. I inject the epoch time every time, sending 100 at a time, and all are through. No need for calculations as it’ll always come out true. Or I set an 8 second interval between receiving and sending. They do like a puzzle, spammers, and I enjoy the battle as I have had one on my case for quite a while, until recently. I’m sure they were manually looking at my guestbook to notice new changes.

The only things I can’t stop now are the gibberish ones. Be great to have a piece of regexp to detect them but not acronyms.

cheesedude · July 23, 2012, 7:46pm

Any “honeypot” method can be easily defeated. If you want a free and easy-to-use Captcha that you can install with only a few lines of code, take a look at the Securimage PHP Captcha.

http://www.phpcaptcha.org/

I’ve been using it for 3 years now. It works great.

Markdidj · July 24, 2012, 7:04am

Nope, nothing got through. I didn’t see a cookie warning, so maybe I blocked them on a random visit I made before. Good job I didn’t want to actually leave a message.

Cups · July 24, 2012, 8:35am

Stopping spammy form submissions seems to fall into two main categories:

a) stopping bots
b) stopping humans who are hired to send you spammy data

There is not much you can do about b) unless you limit your submissions by IP and/or country. For a) there are Turing tests (did a human fill in this form?).

Use Advanced Search here, search for the term “turing” and constrain results to the PHP forum for some conversations over the years.