Do you have any kind of CSRF protection in place?
Also, some bots are smarter than others. If you already use JavaScript on your page, you can also have a hidden field receive a value computed from the digest plus the entered data, something like:
md5(CSRFdigest + formfield1 + formfield2);
Then check on the server side that the correct values are present both in the CSRFdigest field and in the JavaScript-computed field.
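A minimal server-side sketch of that check, in Python. The field names (`csrf_digest`, `field1`, `field2`, `js_check`) are assumptions; substitute whatever your form actually uses:

```python
import hashlib

def verify_submission(form: dict, expected_digest: str) -> bool:
    """Reject a POST unless the CSRF digest matches AND the hidden
    field holds md5(digest + field1 + field2), which only a client
    that actually ran the page's JavaScript will have computed.
    Field names here are hypothetical."""
    if form.get("csrf_digest") != expected_digest:
        return False
    expected = hashlib.md5(
        (expected_digest + form.get("field1", "") + form.get("field2", "")).encode()
    ).hexdigest()
    return form.get("js_check") == expected
```

A bot that replays stored form vars without executing the page's JavaScript will fail the `js_check` comparison even if it echoes the CSRF token back correctly.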
Actually, I am dealing with form-submission software: it scans the internet for forms, stores the form vars, and then submits crap. In my case, every 3 minutes. Since this is form-submission software, I don't think robots.txt can do anything for me. The bot has already come by.
It uses random proxies, so I can't block by IP. It sends a user agent, but it's likely faked and made to appear common.
So that is why I am investigating the Accept-Language check I have heard about.
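For what it's worth, the Accept-Language idea usually boils down to this: crude submission bots often skip headers that every real browser sends. A sketch of that heuristic, with the caveat that the reject policy is an assumption and some legitimate clients (and proxies) also strip the header, so treat it as one signal, not a verdict:

```python
def looks_like_bot(headers: dict) -> bool:
    """Flag a request whose Accept-Language header is missing or empty.
    Header lookup is case-insensitive, since clients vary in casing."""
    normalized = {k.lower(): v for k, v in headers.items()}
    return not normalized.get("accept-language", "").strip()
```

You would call this on the incoming request's header dict and, for example, silently discard (or tarpit) submissions it flags rather than returning an error the bot author could adapt to.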
It is surprising how many bots are out there. Some pretend to be a valid search engine like Google and sneak into your website without you realizing it. Not all bots obey robots.txt, either.
Not only that, but as elgumbo mentioned, some use robots.txt to find out where you don't want them to go and then go there. Set up a "honey pot" and you'll catch some.
Don’t think of the robots.txt file as a security measure by any means.
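For the form-spam case specifically, the honeypot is usually a field hidden from humans with CSS but filled in by bots that blindly populate every input. A minimal sketch; the field name `website` is an assumption, and anything tempting-looking works:

```python
def is_honeypot_tripped(form: dict) -> bool:
    """The honeypot field is hidden from real users (e.g. via
    'display: none' on its wrapper), so any non-blank value means
    the submitter is almost certainly a bot."""
    return bool(form.get("website", "").strip())
```

On the page side you would render something like `<input name="website">` inside a hidden wrapper; on the server, drop any submission where this returns True before doing anything else with the data.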
If you're worried about bots, it's worth looking at the code used by the well-known WordPress plugin 'Bad Behavior' (which can also be used outside the plugin): http://www.bad-behavior.ioerror.us/