I want to get a better understanding of security issues in the processing of contact forms, so that I can ensure that I'm doing everything that I could conceivably need to do while I work on my form's PHP script. I had first better list what I'm doing so far:
- Converting to HTML entities and trimming spaces
- Checking for any unfamiliar array keys in POST (inc select options)
- ...for things like "content-type", "bcc:", etc.
- ...that the maximum length of each input is as it should be
- ...each input with some regular expressions
- ...for common spam keywords
- ...and for URLs in the message
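For anyone following along, the key and length checks in that list can be sketched roughly like this. It is only a minimal sketch: the field names, the length limits and the checkFields() helper are assumptions made up for illustration, not the actual script.

```php
<?php
// Hypothetical whitelist: expected field name => maximum length in bytes.
const ALLOWED_FIELDS = [
    'name'    => 100,
    'email'   => 254,
    'subject' => 150,
    'message' => 5000,
];

/**
 * Returns a list of problems found in the submitted data
 * (an empty array means these particular checks passed).
 */
function checkFields(array $post): array
{
    $errors = [];

    // Reject any key that the form itself never sends.
    foreach (array_keys($post) as $key) {
        if (!array_key_exists($key, ALLOWED_FIELDS)) {
            $errors[] = 'Unexpected field submitted';
        }
    }

    // Each expected field must be a plain string within its length limit.
    foreach (ALLOWED_FIELDS as $field => $maxLength) {
        $value = $post[$field] ?? '';
        if (!is_string($value)) {
            $errors[] = "Field $field is not a string";
            continue;
        }
        if (strlen($value) > $maxLength) {
            $errors[] = "Field $field is too long";
        }
    }

    return $errors;
}
```

Calling checkFields($_POST) before anything else means the later checks only ever see fields and sizes that the form is supposed to produce.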
Reading that, it looks like I might know what I'm doing, but my knowledge is very patchy and I've had to spend a lot of time learning as I go along. Some of what I've added to my script may even be unnecessary or ineffective. Understanding PHP has been particularly difficult. Anyway, the questions:
Leaving aside spam for a minute, exactly what would a hacker enter into a form to try something malicious? What should I include in the regular expression that checks for such data (e.g. 3rd item in above list)? At the moment, I have a regex that I saw on another forum somewhere and it looks like this:
And what .htaccess tricks should I look out for and use? I have a few, mostly from perishablepress.com, but again, a thorough checklist would be very helpful to my research/learning.
My site doesn't have a database, so at least that's one less thing to worry about. But what can a hacker do in other ways? (I assume that there are quite a lot of things.)
The majority of web form spam is looking to do 1 of 2 things:
For a contact form there's no need to reinvent the wheel if you aren't confident in your PHP abilities - use a library such as Swift Mailer that has already taken care of security precautions.
I would never use someone else's script to process my form. I have tried that in the past and I never found one that would suit my demands, high among which is usability. I've spent the last few months learning how to build and fine-tune my own script because it's the only way to get one that really works properly and behaves exactly as I expect.
I would do a combination of both.
I would use my limited knowledge to analyse various pre-mades, and if any didn't look right, either scrap or hack them.
That's what I used to do, way back when I didn't know enough to write my own PHP. I had to expend enough effort on hacking 3rd party scripts that I had to give up because the result wasn't what I wanted and it was easier, by that point, to start from scratch.
But this is a bit off-topic; I just want to know if I have missed any validation methods/security tricks/etc. that would be worth using. Searching Google for advice leaves me thinking that few people have bothered to write in detail about this topic.
I seem to remember writing an article on this years ago, but I may not have posted it online anywhere. (I can't find it.)
Most contact forms solicit four things:
The second item is what most attacks target. Because the mail() function in PHP requires the From header to be passed in its raw form, it presents the best opportunity for an exploit: a newline smuggled into the address lets an attacker inject extra headers such as Bcc. I suggest two steps for making sure an email address is valid:
- Use ctype_print() to make sure there are no non-printable characters in the email address. This is the least you can do.
- Use a reliable pattern to test the format of the email address. I like to use this one: RFC 822 Email Address Parser in PHP
The other things you're doing are fine as defense in depth, but these two things should be at the top of your list.
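A minimal sketch of those two steps, with some assumptions: the isSafeEmail() helper name is made up, and PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL stands in for the RFC 822 parser mentioned above, which is a different and more thorough pattern.

```php
<?php
/**
 * Hypothetical helper: true only if the address is printable
 * and passes PHP's built-in email format check.
 */
function isSafeEmail(string $email): bool
{
    // Step 1: ctype_print() fails on an empty string and on any control
    // character, including the CR/LF needed to inject extra headers.
    if (!ctype_print($email)) {
        return false;
    }

    // Step 2: format check. filter_var() is an assumption here, used as a
    // stand-in for whichever address pattern you prefer.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Only build the raw From header once the address has passed both checks.
$email = 'visitor@example.com'; // placeholder value for illustration
if (isSafeEmail($email)) {
    $headers = 'From: ' . $email;
}
```

The point of the ordering is that nothing user-supplied reaches the headers argument of mail() unless it has been through both tests.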
Hope that's helpful.
Is there a reason you don't use Wufoo (an online form builder for web forms and surveys)?
They handle all of this spam and security for you.
I have never heard of "Wufoo" and would not know what it is. I wanted to do it all myself and learn by doing (and reading and asking questions, of course). Someone else's way of doing things is, in my experience, unlikely to suit me. I am extremely keen on accessibility while a lot of "professional" solutions, for various tasks, are not as accessible as they should be.
So I tend to bypass such things now for those reasons. I don't have time constraints or clients to please, either, of course.
By the way; thanks to "shiflett" for your earlier reply, which I hadn't seen.
Going back to your original question, here are a couple of things a hacker can try to do with a contact form:
1) Cross Site Scripting attack on you if you are viewing messages through a browser
2) Cross Site Scripting attack on your users if the message is displayed on the site somewhere (given you don't have a db, I don't think this is a concern)
3) Email SPAM by various methods
4) Crash your web server
There are a lot of other things one can do depending on how the forms are implemented on the client and server side.
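For the cross-site scripting cases, the usual defence is to escape the message at the point where it is written into a page. A minimal sketch, assuming the page is UTF-8 and using a made-up escapeForHtml() wrapper around PHP's htmlspecialchars():

```php
<?php
/**
 * Hypothetical wrapper: makes submitted text safe to print inside HTML.
 * ENT_QUOTES escapes both quote styles; ENT_SUBSTITUTE replaces invalid
 * byte sequences instead of returning an empty string.
 */
function escapeForHtml(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$message = '<script>alert(1)</script>'; // example of hostile input
echo escapeForHtml($message);           // shown as text, not run as script
```

Escaping on output rather than on input keeps the stored or emailed text intact, so a plain-text email does not end up full of HTML entities.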
I know you already went there in other posts, but still, for security's sake you are better off with a commercial or community-driven solution than writing your own. This advice is even stronger for dealing with SPAM.
That being said, the basic security methodology talks about white list + black list - accept only known good and block known bads.
For the name and title, make sure you do not accept any special characters; escaping them is good practice (black list). From the white list side, you can probably write a robust regular expression for what a valid name and title should look like.
For the email, in addition to the patterns you described (black list), I would also add a regular expression for a standard email address, taken from the many examples out there (white list).
For the message, again stripping off special characters and escaping is a good practice.
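A rough sketch of the white list idea for the name, plus control-character stripping for the message. The function names, the allowed character set and the 100-character limit are all assumptions; real names vary a lot, so treat the pattern as a starting point.

```php
<?php
/**
 * Hypothetical white list for a name: starts with a letter, then letters,
 * spaces, periods, apostrophes or hyphens, 100 characters at most.
 * \A and \z anchor the whole string, so a trailing newline is rejected.
 */
function isValidName(string $name): bool
{
    return preg_match('/\A\p{L}[\p{L} .\'-]{0,99}\z/u', $name) === 1;
}

/**
 * Hypothetical cleaner for the message body: removes control characters
 * but keeps tabs, line feeds and carriage returns.
 */
function cleanMessage(string $message): string
{
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', '', $message) ?? '';
}
```

Because the name ends up on a single header-like line while the message is free text, the name gets the strict pattern and the message only loses characters that have no business being there.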
All the above suggestions do not consider your email client which expects to receive input in a specific format. You should make sure you do not break anything by passing input that is not appropriate.
Without seeing your specific implementation this is the best I can do.
Hope it helps.