Need some CAPTCHA help

Hi there.

On my website, I have a simple Guestbook where you leave feedback on the site. Recently, I have been getting comments written in Russian, and when I translate them in Google, it appears they’re advertising a porn site.

I’m new to the Web Development world and am of course new to this kind of web security. I’ve been to CAPTCHA.net but I don’t exactly know where to go when I’m there.

I’m not sure if I want the kind of system where it makes visitors read illegible characters. I’d rather have them name an animal, like a giraffe. I want a picture of a giraffe and I want humans to tell me what it is before they can post a comment. So how exactly would I go about doing this?

I’m using phpMyAdmin. Not sure if that matters.

Before you go down this route, ask yourself how a blind human is going to feel when you say—“you must state what is in this picture before proceeding”.

A better option is to have something like a simple question: “what is 2 + 2?” or “is water wet or dry?”

This kind of form field is often called a “honeypot”. It can be a hidden field, designed primarily to trip up spam bots.

I know, “are giraffes tall”?

How would I do this exactly?

Hmmm, you’d be expecting the user to answer “Yes”, but what if they answer “Indeed”? :stuck_out_tongue:

How will you be setting up the other form fields? If you are able to set up the others, just create one more and perhaps hide it with display: none so that it will mostly be seen just by bots. Set a PHP rule to abort the form if anything but the right answer is inserted (just in case someone has CSS turned off and fills the field in).

If this is all double Dutch, you may need to hire a PHP developer.

I think you’ll find that question/answer-type CAPTCHAs are basically useless against someone determined to “break into” a website.

All they have to do is keep hitting your page to eventually get all the questions and then build a bot to provide the correct answer to the question it gets when it visits your page.

While it is true that these questions offer almost no protection against an attack focused on a particular website, it is also true that the vast majority of spammers do not really care about your particular site; they spam everywhere they can get “their dirty bytes” onto. :wink:
So, from my experience, even a form that says:
<div>type word: "aquaflex" in field below</div><input type="text" name="just_checking" value="">
keeps most troublemakers at bay. The secret here is customization: if you are not a big target, the script writers for spam programs just won’t bother with you and will move on to an easier target (which in your particular case is a good thing).
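For what it's worth, the server-side half of that check could look roughly like the sketch below. The field name just_checking and the word "aquaflex" are taken from the form above; the function name passes_word_check is made up for illustration, and the case-insensitive comparison is my own assumption about how forgiving you want to be.

```php
<?php
// Hypothetical helper (not from any library): returns true when the
// visitor typed the expected word into the "just_checking" field.
function passes_word_check(array $post, string $expected = 'aquaflex'): bool
{
    // A missing field, or a non-string value such as an array, fails.
    if (!isset($post['just_checking']) || !is_string($post['just_checking'])) {
        return false;
    }

    // Ignore surrounding whitespace and letter case ("Aquaflex" passes).
    return strcasecmp(trim($post['just_checking']), $expected) === 0;
}
```

In the script that processes the form you would call passes_word_check($_POST) and skip saving the comment when it returns false.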

I’m not actually worried about my site being hacked, for there really is no reason to hack it (which I know doesn’t mean that they won’t). It’s mostly a bot advertising another website. I don’t want to have to manually delete the comments on a daily basis like I have been for the past few days.

I’ve recently seen one where it was a picture of a puppy, and it said to not worry if I type in another word for it if it means the same thing (like I suppose typing “dog” instead).
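Accepting more than one word for the same picture is mostly a matter of keeping a small list of allowed answers and tidying up what the visitor typed before comparing. A minimal sketch, assuming the accepted words are "puppy" and "dog" (the function name answer_matches is hypothetical):

```php
<?php
// Hypothetical helper: true when the typed answer is one of the accepted
// words, ignoring letter case and surrounding whitespace.
function answer_matches(string $typed, array $accepted): bool
{
    $normalized = strtolower(trim($typed));

    // Strict comparison so only exact string matches count.
    return in_array($normalized, $accepted, true);
}

// Accepted answers for the puppy picture, all in lower case.
$puppyAnswers = ['puppy', 'dog'];
```

The same list approach would cover the “Yes” versus “Indeed” problem mentioned earlier, as long as you can think of the likely answers in advance.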

No, it’s not. What you’re describing is still a CAPTCHA – a Completely Automated Public Turing test to tell Computers and Humans Apart – just not a graphical one but a textual one.

A honeypot is a field that is hidden (through display: none, visibility: hidden, { position: absolute; left: -999em; }, etc.) and that should not be filled in. Humans don’t have a problem with this, because the field is hidden and they don’t even know it’s there, but spam bots work on the raw HTML, and most of them don’t even parse CSS, so they will just fill it in.
Then on the back end, just discard the entry if the honeypot was filled in.
Another type is where you have a field filled in by JavaScript (based on the assumption that spam bots don’t run JavaScript), but that’s a little more obtrusive since some people have JavaScript disabled.
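On the back end, the plain honeypot check described above can be quite small. This is only a sketch: the field name "website" is an assumption (pick any name a bot is likely to fill in), and honeypot_tripped is a made-up function name.

```php
<?php
// Assumed name of the hidden form field; not from the original posts.
const HONEYPOT_FIELD = 'website';

// Hypothetical helper: true when the hidden field contains anything,
// which suggests the form was filled in by a bot.
function honeypot_tripped(array $post): bool
{
    // Field absent: nothing was filled in, so treat the visitor as human.
    if (!isset($post[HONEYPOT_FIELD])) {
        return false;
    }

    $value = $post[HONEYPOT_FIELD];

    // Any non-string value, or any non-blank text, counts as filled in.
    return !is_string($value) || trim($value) !== '';
}
```

If honeypot_tripped($_POST) returns true, discard the entry instead of writing it to the guestbook.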

That’s what I was talking about, so sorry if that wasn’t clear.

The problem with just relying on a honeypot being hidden, though, is that it’s always possible that someone will see it, say with CSS off or on a screen reader. So it’s best to make it a form field that people can do something with in this situation. (You could just say “don’t fill this in!”, but that’s a bit weak. I actually had a client once who viewed the page with CSS off and freaked out over this.) So these days I set the hidden honeypot to abort if the field is filled in with anything other than the one response prompted by the label. The form will only process if the field is empty or is filled in with the one allowable response.
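The "empty or the one allowable response" rule might be sketched like this. The function name honeypot_field_ok is hypothetical, and the allowed answer is whatever your label prompts for (for example "4" if the label asks “what is 2 + 2?”).

```php
<?php
// Hypothetical helper: the hidden field passes when it is absent, empty,
// or holds exactly the one answer prompted by its label.
function honeypot_field_ok($value, string $allowedAnswer): bool
{
    // Field not submitted at all: the visitor never saw it.
    if ($value === null) {
        return true;
    }

    // Arrays or other unexpected types are rejected outright.
    if (!is_string($value)) {
        return false;
    }

    $value = trim($value);

    // Pass if left blank, or if it matches the allowed answer
    // (letter case is ignored).
    return $value === '' || strcasecmp($value, $allowedAnswer) === 0;
}
```

You would call it with something like honeypot_field_ok($_POST['check'] ?? null, '4'), where "check" is an assumed field name, and abort the form when it returns false.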

So, anyone know of a good tutorial/article so I can finally make one?

What’s your level of experience? Do you know any PHP?

I wouldn’t be able to create anything big from scratch, but I can read and understand PHP code. I’m still a bit of a newbie, but I want to learn how to do it.

There are quite a few tutorials out there. Not sure how good these are, but here are some, picked at random:

PHP form tutorial

HTML - PHP Form Example

PHP Tutorials: Checkboxes

An easier option is to use something like Wufoo.

I’ll give them a look, thank you.

If you are using WordPress, try the “Akismet” plugin. This plugin protects your site from spam comments.

For the CAPTCHA script that I’ve written, I’ve simply added ALT text to the image that reads “Type ‘iamblind’ in the field to the right”, and then set up the CAPTCHA test to either accept that, or the correct SHA1 hash of the submitted answer. This takes care of screen readers, and those with images disabled, but I’m also one of those who have a very hard time reading most CAPTCHA images that show distorted words, or jumbles of letters, so I created a different sort of CAPTCHA image test. The question text on the image is probably quite easy for a good OCR script to decipher, but the question is linked to a randomly generated image that the viewer is asked to describe with a single-word answer. An example of this can be seen at http://www.geekcavecreations.com/captcha_new_concept.php and feedback is welcome. Right now, this is just “proof of concept”, and thus has limited (around 27 or so) possible permutations, but I intend on expanding it at some point in the near future. :slight_smile:
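One possible reading of that check, as a rough sketch rather than the actual script: the server keeps the SHA1 hash of the correct answer, and a submission passes if it is the fallback word "iamblind" or if its hash matches. The function name captcha_passes and the lower-casing of the answer are assumptions made for illustration.

```php
<?php
// Hypothetical helper: $expectedHash is assumed to be sha1() of the
// correct answer in lower case, stored on the server side.
function captcha_passes(string $submitted, string $expectedHash): bool
{
    $answer = strtolower(trim($submitted));

    // Fallback word offered through the image's ALT text.
    if ($answer === 'iamblind') {
        return true;
    }

    // hash_equals compares the two hashes in constant time.
    return hash_equals($expectedHash, sha1($answer));
}
```

Bear in mind that a fixed fallback word is exactly the kind of thing a bot written for one particular site could learn, as discussed earlier in the thread.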

I too am not a CAPTCHA fan… it’s super hard to read! Based on the comments above, then, would it be safe to say that a viable alternative would be a PHP script that asks randomly generated questions such as:

“What is x + y?” where x and y are random numbers between 1 and 10, generated by PHP and then checked?

If so, why are CAPTCHAs so complicated?
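A random sum question along those lines could be sketched as follows. The function names are made up, and the sketch assumes you keep the expected answer on the server between showing the form and processing it (for example in $_SESSION) rather than putting it in the page.

```php
<?php
// Hypothetical helper: builds a question from two random numbers
// between 1 and 10 and returns its text along with the expected answer.
function make_sum_question(): array
{
    $x = random_int(1, 10);
    $y = random_int(1, 10);

    return ['text' => "What is $x + $y?", 'answer' => $x + $y];
}

// Hypothetical helper: true when the submitted value is a whole number
// equal to the expected answer.
function check_sum_answer($submitted, int $expected): bool
{
    if (!is_string($submitted)) {
        return false;
    }

    $submitted = trim($submitted);

    // ctype_digit rejects empty strings and anything that is not 0-9.
    return ctype_digit($submitted) && (int) $submitted === $expected;
}
```

When the form is shown you would store $question['answer'] in the session and print $question['text']; when it is submitted you would compare the posted value against the stored answer with check_sum_answer. As noted above, this should stop generic bots but not one written specifically for your site.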

<?php
require_once('recaptchalib.php');
$publickey = "…"; // you got this from the signup page
echo recaptcha_get_html($publickey);
?>
Notice that the require_once statement in the example above expects recaptchalib.php to be in the SAME directory as your form file. If it is in another directory, add that to the path. This works just like linking to pages on your site. So, for example, if your recaptchalib.php is in a directory called “captcha” that is on the same level as your form file, the statement will look like this:

require_once('captcha/recaptchalib.php');
Configuring the Process File
We need to add the following code to the process.php file (or whatever file you are using to process the form). It has to be at the beginning of the file, before anything else, so keep that in mind (otherwise it’ll give you a warning about headers having been set already).

Notice that this code asks for the private key, not the public one. Don’t confuse the two, otherwise the CAPTCHA won’t work. You get the private key from the same page as the public key.

For More - Installing reCAPTCHA with PHP « inko9nito

Unfortunately, reCAPTCHA is one of the most unreadable of the currently available CAPTCHA scripts. Almost invariably, one word is easy to decipher, while the other is a complete mystery. You see, I have a mild form of dyslexia that makes it hard enough to read normal text, let alone text that’s been obfuscated by visual distortion or line occlusion. This is why my approach may be better for people like me, while at the same time making it more difficult for bot scripts to make sense of. When you take OCR out of the equation, you remove one of the bot scripts’ best tools for cracking a CAPTCHA. I’m not certain how well the script I’ve written would hold up to a serious cracking attempt, but I would be interested in finding out.