I got hacked some time ago and the attack vector was to inject JavaScript at the beginning of certain files. That code fetched the spamming code from their website and was difficult to find (maldet was a massive help in tracking down all instances across subdomains [addons]). IMHO, allowing ANY HTML code is a major error and offers an easy avenue of attack. In other words, been there, done that, and you don't want to make that any easier than hacking passwords (the presumed entry point).
Thanks for the response. Our site works much like craigslist: users make a post and it gets inserted into a section within the <body> of the page, so they wouldn't have access to the headers or the beginning of the files.
I know ckeditor has some kind of filter, so maybe there is a way to single out and strip <script> code.
Why would it be bad to include HTML for styling the listing like eBay does, even when all <script> is removed?
Since <script> tags can be put within the <body>, the location makes no difference.
Do you want <div> and <iframe> tags, too? There are just too many ways to mess up your page if you allow HTML tags.
For my clients, I've written something like your ckeditor but have it add code like SitePoint uses. Try making parts of your text bold or italic - that's what I've done and it works fine (safe).
If you can envision all the various nonsense that hackers can use (like encoding < as &lt; or %3c), then MAYBE you can outguess ALL the hackers' attempts to use your site as a launching pad for SPAM (and other exploits). IMHO, it's not worth the effort. Learn from SitePoint's code.
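To make the point above concrete, here is a minimal Python sketch of a "strip the script tags" filter and three inputs that get past it. The function name naive_strip_scripts and the sample payloads are made up for illustration; this is not ckeditor's filter or any real library's code, just the blacklist idea in its simplest form.

```python
import re
from urllib.parse import unquote


def naive_strip_scripts(text: str) -> str:
    """Blacklist approach: remove literal <script>...</script> blocks, nothing else."""
    return re.sub(r"<script\b.*?</script>", "", text, flags=re.IGNORECASE | re.DOTALL)


# 1. URL-encoded payload: there is no literal "<script" for the filter to find.
encoded = "%3Cscript%3Ealert(1)%3C/script%3E"
filtered = naive_strip_scripts(encoded)
# Assumption: some later layer decodes the value before it reaches the page.
decoded_later = unquote(filtered)

# 2. Event-handler attribute: no <script> tag is needed at all.
handler = '<img src="x" onerror="alert(1)">'
handler_out = naive_strip_scripts(handler)

# 3. Nested tag: removing the inner block splices a new <script> together.
nested = "<scr<script></script>ipt>alert(1)</script>"
nested_out = naive_strip_scripts(nested)

if __name__ == "__main__":
    print(decoded_later)
    print(handler_out)
    print(nested_out)
```

In all three cases the filter did what it was told and the dangerous markup still came out the other side, which is why guessing at what to remove is a losing game.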
I agree. Instead of trying to figure out what tags to not allow, think about what to allow. That is, put together a set of tags you think would be good to have, create bbcode tags for them, and limit users to those. Allowing HTML and/or CSS is an invitation to trouble, and allowing script will certainly bring problems.
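A rough Python sketch of that allowlist idea, in case it helps: escape everything the user typed first, then turn only a fixed set of bbcode tags into HTML. The ALLOWED_TAGS table and the render_bbcode name are hypothetical, and a real site would probably want more tags (and handling for badly nested ones), so treat this as a starting point rather than a finished filter.

```python
import html
import re

# Hypothetical allowlist: bbcode tag name -> HTML element it renders to.
ALLOWED_TAGS = {"b": "strong", "i": "em", "u": "u"}


def render_bbcode(text: str) -> str:
    """Escape all user input, then convert only allowlisted bbcode pairs to HTML."""
    # Escape first, so any raw HTML the user typed becomes inert text.
    safe = html.escape(text, quote=True)
    for bb, element in ALLOWED_TAGS.items():
        pattern = re.compile(rf"\[{bb}\](.*?)\[/{bb}\]", re.IGNORECASE | re.DOTALL)
        safe = pattern.sub(rf"<{element}>\1</{element}>", safe)
    return safe


if __name__ == "__main__":
    print(render_bbcode("[b]For sale[/b] <script>alert(1)</script>"))
    print(render_bbcode("[b][i]Like new[/i][/b]"))
```

Anything not in the table, whether it is raw HTML or an unknown bbcode tag, is shown to readers as plain text instead of being interpreted by the browser.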