Quite a while back I touched on an experiment I was running with various methods to control spam coming from web-based contact forms without using CAPTCHA. The news that Gmail's CAPTCHA had been broken sparked a new conversation between me and some friends. I brought up the tests I was doing and how successful they had been, then realized I had never really covered them here.
My experiment has proven, to date, very successful. I have a small collection of sites out there using these methods that had previously been inundated with spam for a long time. Their users had resorted to server-side spam controls, which had marginal benefit or too many false positives, or to inbox filters, which still forced users to eventually sift through the flagged messages.
I tested what amounted to a multi-tiered system that caught obvious kinds of spam. The first tier was content scanning. Spam through contact forms generally promotes something, so you can almost count on a few things being in the messages: HTML, bbCode (custom HTML-like tags for message forums), or URLs. Programmatically, you can scan messages, count how many of each there are, and set limits on how many URLs or HTML/bbCode tags you are willing to let through before flagging a message as spam. This alone seemed to catch the majority of the spam.
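A minimal Python sketch of that counting idea might look like the following. The regular expressions, the limits, and the function names are illustrative assumptions, not the code used in the tests, and the thresholds would need tuning for each form.

```python
import re

# Hypothetical patterns: deliberately loose, meant to count "promotional"
# markers in a message rather than to parse HTML or bbCode properly.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
HTML_TAG_RE = re.compile(r"</?[a-zA-Z][^<>]*>")
BBCODE_TAG_RE = re.compile(r"\[/?[a-zA-Z*][^\[\]]*\]")

# Hypothetical limits: how many of each we tolerate before flagging.
MAX_URLS = 2
MAX_TAGS = 0


def count_markers(message):
    """Count URLs, HTML tags, and bbCode tags found in the message."""
    return {
        "urls": len(URL_RE.findall(message)),
        "html": len(HTML_TAG_RE.findall(message)),
        "bbcode": len(BBCODE_TAG_RE.findall(message)),
    }


def looks_like_promo_spam(message, max_urls=MAX_URLS, max_tags=MAX_TAGS):
    """Flag the message when either count exceeds its limit."""
    counts = count_markers(message)
    too_many_urls = counts["urls"] > max_urls
    too_many_tags = counts["html"] + counts["bbcode"] > max_tags
    return too_many_urls or too_many_tags
```

With these example limits, a plain message with one or two links passes, while anything carrying markup gets flagged. A contact form that legitimately receives HTML would want a higher tag limit.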
The second was a timer: a hidden field on the form holding a timestamp. Upon submission, compare that timestamp with the current time. If the original timestamp doesn't exist, or is only a millisecond behind, you know it's an automated submission.
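The timer check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the three-second threshold are made up here, and the timestamp is assumed to be written into the hidden field as seconds since the epoch when the form is rendered.

```python
import math
import time

# Hypothetical threshold: a person needs at least a few seconds to fill in
# a contact form, so anything faster is treated as automated.
MIN_SECONDS = 3.0


def make_timestamp_field(now=None):
    """Value to place in the hidden field when the form is rendered."""
    now = time.time() if now is None else now
    return f"{now:.3f}"


def submitted_too_fast(field_value, now=None, min_seconds=MIN_SECONDS):
    """Return True when the timestamp is missing, garbled, or too recent."""
    now = time.time() if now is None else now
    try:
        rendered_at = float(field_value)
    except (TypeError, ValueError):
        return True  # field missing or not a number
    if not math.isfinite(rendered_at):
        return True  # "nan" or "inf" is not a real timestamp
    # A timestamp from the future gives a negative elapsed time,
    # which is also below the threshold and so gets flagged.
    return now - rendered_at < min_seconds
```

Note that a plain timestamp like this can be forged by a bot that supplies its own old value, so this sketch only catches the lazy ones.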
Lastly was a reasonably old trick, but a good one: a text input field hidden by CSS. If it has anything in it upon submission, you know it was an automated bot just squirting stuff into any field it sees. If it's empty, let it go; if it's not, flag it. This method all by itself didn't seem to catch that much, based on the tests I ran.
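As a rough Python sketch of the hidden-field check: the field name `website`, the CSS class in the form snippet, and the function name are all hypothetical choices for illustration, and the submitted form is assumed to arrive as a dictionary of field names to values.

```python
# Hypothetical markup for the trap field. The CSS class is assumed to be
# defined elsewhere as "display: none", so real visitors never see it.
FORM_SNIPPET = """
<style>.hp-field { display: none; }</style>
<input type="text" name="website" class="hp-field" autocomplete="off">
"""

# Hypothetical name: something a bot is likely to want to fill in.
HONEYPOT_FIELD = "website"


def honeypot_tripped(form, field=HONEYPOT_FIELD):
    """Return True when the hidden field came back with anything in it."""
    value = form.get(field) or ""
    # Whitespace alone does not count as content.
    return bool(str(value).strip())
```

An empty or absent field passes, and anything else gets flagged, matching the "if it's empty, let it go" rule.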
On the whole, mixing and matching these has virtually eliminated spam coming through the web sites of at least one school district I know of using it, a few web site owners I know who have helped me in the tests, and a few web sites I run personally.
It proves that a little creativity and pattern busting can beat spammers without CAPTCHA in a typical web site contact form. As for spammers looking to crack a service such as Gmail and use it for further spamming, well, I'll leave that to the minds at Google. The spam they face is obviously not quite as predictable as the standard contact form spam, but they'll figure it out, I am sure.
Wednesday, February 27, 2008
My Anti-Spam Form Experiment Wrap-Up
Posted by
dB Masters
at
5:38 PM
Labels: Web Development


1 comment:
Great tips! I have used #3 quite often (with great success), but know it's only a matter of time until that is no longer effective.
Thanks for some extra thoughts for the arsenal.
I guess ultimately the message is "Be Creative". Spammers/Hackers love standards.