Sunday, July 12, 2009

Spam Control

I've received a positive response to my recent posts about developing the website, so I'm going to continue that trend.

Today I wanted to quickly talk about something every web developer despises... SPAM. While most of us appreciate the tribute to Monty Python, we'd prefer not to have to deal with it, and we definitely don't want it cluttering up our website.

Spam comes in all forms: email, comments, posts, and contact form submissions. Even our requests page gets spam sometimes.

At SinlessLinks I have built in a lot of different ways to help cut down on spam.

One of the most obvious is that anywhere there's a submission page, there's a captcha for the person to fill out. This has helped cut down on the spam tremendously. If a spammer has to do extra work to place their spam on your website, odds are they'll move on to someplace else. One thing you'll notice about our captchas, though, is that they're fairly easy to read and extremely simple. I hold projects such as reCAPTCHA in high regard for what they attempt to do, but I absolutely despise how hard their captchas are to read and answer correctly. I believe that a captcha shouldn't inconvenience your real users more than it has to. This was one of the big reasons for creating the user registration: our regular users can sign up and not have to deal with the evil captchas any longer.

The downside to having such a simple captcha system is that some spam still gets through, even from automated bots. So more needed to be done.

To take the spam prevention one step further, I created a spam filter, which is currently still hard-coded. It checks every comment against an array of words that we've found only appear in spam comments (such as the names of various adult-oriented medications).
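
For the curious, a word-list filter like this can be sketched in a few lines of Python. This is only an illustration and not the actual site code: the names SPAM_WORDS and is_spam are made up for the example, and the words in the list are placeholders rather than the real list.

```python
import re

# Placeholder blocklist (an assumption for this sketch). A real list would
# hold words that have only ever shown up in spam comments.
SPAM_WORDS = {"examplepill", "cheapmeds", "fakewatches"}


def is_spam(comment, spam_words=SPAM_WORDS):
    """Return True if the comment contains any blocklisted word.

    Matching is case-insensitive and works on whole words, so a blocked
    word hidden inside a longer word is not flagged.
    """
    words = set(re.findall(r"[a-z0-9]+", comment.lower()))
    return not words.isdisjoint(spam_words)
```

Matching whole words keeps false positives down, at the cost of missing deliberately misspelled variants. A hard-coded set like this also means every new spam word requires a code change.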

Another deterrent I put in place is "nofollow". Adding rel="nofollow" to every link in the comments section helps deter spammers, as they won't get any PageRank boost, nor will their pages get indexed faster by spamming us. You'll also notice that I don't use BBCode. I personally believe that comments are secondary to the actual post, and therefore don't need special formatting such as italics and bold to get their point across. Spammers often use this formatting to draw attention to their posts, and when their posts come out looking plain, they get frustrated and move on.
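
Here's a rough Python sketch of the idea: treat the comment as plain text, escape any HTML in it, and turn bare URLs into links that carry rel="nofollow". The function name render_comment and the URL pattern are assumptions for this example, not how the site actually does it.

```python
import html
import re

# Deliberately simple URL pattern (an assumption for this sketch); it will
# also swallow punctuation that directly follows a URL.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def render_comment(text):
    """Escape a plain-text comment and link its URLs with rel="nofollow"."""
    parts = []
    last = 0
    for match in URL_RE.finditer(text):
        # Escape the ordinary text between URLs so user HTML shows up as text.
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        parts.append('<a href="{0}" rel="nofollow">{0}</a>'.format(url))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Because everything outside the URLs is escaped, any bold or italic tags a spammer types show up as literal text instead of formatting.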


But alas, even with all these spam prevention techniques in place, we still get up to ten spam comments a day. Now that's not too bad, but Ribbitz and I are picky... we don't want any spam. So our foolproof method is to simply delete the spam. We have a "Recent Comments" page that about 20 admins have very easy access to. Between these 20 people, this page is checked upwards of 100 times a day, and every spam comment found can easily be deleted by clicking a red "x" next to the comment. Beyond that, there's a "Ban" link right next to the red x. Clicking on the Ban link prevents that particular IP from posting any more comments, or, if the commenter is logged in, it prevents that user from posting any more comments. Yes, we do run the risk of accidentally preventing legit users from posting comments, but so far we haven't run into any complaints of that sort.
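
The logic behind a Ban link like that can be sketched as follows, again in Python and again only as an illustration. The BanList class and its method names are hypothetical, the bans live in memory rather than a database, and checking both the IP and the username before allowing a post is an assumption made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class BanList:
    """In-memory ban list; a real site would persist this somewhere."""
    banned_ips: Set[str] = field(default_factory=set)
    banned_users: Set[str] = field(default_factory=set)

    def ban(self, ip: str, username: Optional[str] = None) -> None:
        # Logged-in commenters are banned by account, anonymous ones by IP.
        if username is not None:
            self.banned_users.add(username)
        else:
            self.banned_ips.add(ip)

    def may_post(self, ip: str, username: Optional[str] = None) -> bool:
        # Assumption: a post is refused if either the IP or the account is banned.
        if ip in self.banned_ips:
            return False
        return username is None or username not in self.banned_users
```

Banning by IP is the part that risks catching legit users, since several people can share one address.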


From the whole process, I've learned that while you can work forever to create an automatic spam prevention system, the best method is to simply make it easy for devoted users to take care of the spam quickly and quietly.

~SinlessLinks

Here's a pic of just how easy it is for our Admins to take care of nefarious comments:



