Keith Devens .com
Saturday, July 4, 2009

Joseph Scott (http://joseph.randomnetworks.com/) wrote:
Dennis Pallett (http://www.phpit.net) wrote:
This is quite a difficult problem, and although your solution counters it somewhat, it's far from foolproof. I can't think of any other good way though.
Keith (http://keithdevens.com/) wrote:
"They already have gone through the trouble."
Well, I figured someone probably has... I just meant that someone would be less likely to go through the trouble for my site (whereas if I was trying to implement something to, say, protect all WordPress blogs, all bets might be off).
Hmm... what about "randomized" field names? Where each field name is a hash of the real field name + the timestamp or something. Would that do anything at all? It's more trouble than it's worth (though actually not that hard with my library); I'm just trying to think outside the box.
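A minimal sketch of the randomized field name idea, for illustration only. The comment above says just "a hash of the real field name + the timestamp or something"; the keyed HMAC, the `SECRET` value, and the function names here are all assumptions, not anything from the site's code.

```python
import hashlib
import hmac
import time

# Hypothetical server-side secret; without one, a bot could recompute the names.
SECRET = b"change-me"


def obscure_name(real_name: str, timestamp: int) -> str:
    """Derive a per-form field name from the real name and the form's timestamp."""
    msg = f"{real_name}:{timestamp}".encode()
    # Prefix with a letter so the result is always a valid-looking field name.
    return "f" + hmac.new(SECRET, msg, hashlib.sha256).hexdigest()[:16]


def render_names(real_names, timestamp=None):
    """Return the timestamp used and a map of real name -> name to put in the HTML."""
    ts = int(time.time()) if timestamp is None else timestamp
    return ts, {name: obscure_name(name, ts) for name in real_names}


def decode_post(posted: dict, real_names, timestamp: int) -> dict:
    """Map posted field names back to real names; unknown fields are dropped."""
    lookup = {obscure_name(name, timestamp): name for name in real_names}
    return {lookup[key]: value for key, value in posted.items() if key in lookup}
```

The timestamp has to travel with the form (for example in a hidden field) so the server can rebuild the mapping. A bot that fetches the form fresh before each post still sees the current names, so this only defeats scripts with hard-coded field names.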
mike macgirvin (http://baddcafe.com/mike) wrote:
I thought long and hard about this problem recently, and my own conclusion was to just require moderation on all comments. That's the best thing in the arsenal for small sites.
I was going to generate a GUID on the form which would only allow one post per form view. But then I'd have to store the GUID and deal with time-to-live issues. Not too hard, but moderation seemed to accomplish the result on my small site without any extra code. Your idea seems to be a cut above the GUID solution by adding the IP check, but along the same principles.
But it's relying a bit on obscurity, which I don't like to rely on. Once somebody knows the method, they can work around it. Hit the site a thousand times quickly to grab a collection of hidden variables and then post them back again just as quickly. This would also get around your randomized field names. You can rate limit, but then the spammer adds a sleep() call. It's an escalating war...
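The one-post-per-form-view GUID described above could look roughly like this. It is a sketch under assumptions: the in-memory dict stands in for whatever storage a real site would use, and the class name, the one-hour default, and the IP binding are illustrative choices rather than anyone's actual implementation.

```python
import time
import uuid


class OneTimeTokens:
    """Issue a GUID per form view; accept each GUID once, before it expires."""

    def __init__(self, ttl_seconds=3600.0, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._issued = {}  # token -> (expiry time, client IP)

    def issue(self, client_ip):
        self._purge()
        token = uuid.uuid4().hex
        self._issued[token] = (self.clock() + self.ttl, client_ip)
        return token

    def redeem(self, token, client_ip):
        # pop() removes the token, so a second post with it is rejected.
        entry = self._issued.pop(token, None)
        if entry is None:
            return False
        expires_at, issued_ip = entry
        return self.clock() <= expires_at and issued_ip == client_ip

    def _purge(self):
        # Time-to-live handling: drop tokens that were never used.
        now = self.clock()
        for token in [t for t, (exp, _) in self._issued.items() if exp < now]:
            del self._issued[token]
```

As the comment points out, this does not stop a script that loads the form once per post; it only prevents replaying one captured form many times.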
Keith (http://keithdevens.com/) wrote:
I won't do full moderation because it's giving up too much, particularly for the tiny amount of comment spam I actually get. I'd much rather have comment spam show up for a little while than make people wait to comment and kill any chance for spontaneous conversation.
It'd also be weird if, say, four people responded saying the same thing, but didn't know that they said the same thing because all the comments were moderated. Then when I unmoderated them, the latter three would look strange.
"Once somebody knows the method, they can work around it."
Yeah, but it takes them work and that's the point. The harder you make it, the more people you'll weed out. You'll never weed out the most dedicated, or people who comment-spam by hand, but by raising the bar you'll lead people to go elsewhere. No one's going to waste time writing a form processor just for my site, for instance... their time is better spent hitting all the WordPress installs with default templates.
mike macgirvin (http://baddcafe.com/mike) wrote:
I understand your parameters... and given that environment, the hash should work fine. I was also recently contemplating a two-stage submit. I know several sites that have email verification for submissions (e.g. craigslist, which also uses captcha), but if you don't need that level of protection a two-click submit might do the trick. Hand back a hash when you get the form input, but wait until you get the hash back in a second POST to process it. This one has a much shorter lifespan than one where you have to wait for typing. Give them a minute or two to verify or toss it.
Just another outside-the-box approach.
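The two-stage submit could be sketched as below. Assumptions are labeled: the comment says "hand back a hash", while this version hands back a random token from `secrets.token_hex`, which plays the same role; the class name, the in-memory dict, and the 120-second window (from "a minute or two") are illustrative.

```python
import secrets
import time


class TwoStageSubmit:
    """Hold a submitted comment until the client confirms it in a second POST."""

    def __init__(self, ttl_seconds=120.0, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._pending = {}  # confirmation token -> (expiry time, comment)

    def stage(self, comment: dict) -> str:
        """First POST: park the comment and return the token to hand back."""
        token = secrets.token_hex(16)
        self._pending[token] = (self.clock() + self.ttl, comment)
        return token

    def confirm(self, token: str):
        """Second POST: return the comment to publish, or None to toss it."""
        entry = self._pending.pop(token, None)
        if entry is None:
            return None
        expires_at, comment = entry
        return comment if self.clock() <= expires_at else None
```

A real version would also sweep out staged comments that are never confirmed; that is left out here to keep the sketch short.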
mike macgirvin (http://baddcafe.com/mike) wrote:
Here's the word from WordPress:
http://codex.wordpress.org/Combat_Comment_Spam
http://codex.wordpress.org/Plugins/Spam_Tools
I found particularly interesting the 'hashcash' tool - which creates a randomized form element and then uses client-side javascript to compute the md5 of that element and send it back. That's pretty darn effective and relatively hassle free, unlike captcha and email verify.
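The server side of a hashcash-style check, as described in the comment above, might be sketched like this. It follows only that description (random form element, client-side md5, value sent back) and is not the plugin's actual code; the function names and the `outstanding` set are hypothetical.

```python
import hashlib
import secrets


def new_challenge(outstanding: set) -> str:
    """Create the random value to embed in the form and remember it."""
    challenge = secrets.token_hex(8)
    outstanding.add(challenge)
    return challenge


def expected_answer(challenge: str) -> str:
    """What the client-side JavaScript is supposed to compute and send back."""
    return hashlib.md5(challenge.encode()).hexdigest()


def verify(challenge: str, answer: str, outstanding: set) -> bool:
    """Accept only a challenge we issued, once, with the matching md5."""
    if challenge not in outstanding:
        return False
    outstanding.discard(challenge)  # each challenge is good for one post
    return answer == expected_answer(challenge)
```

The md5 is not a secret; the point is that a bot has to run the page's JavaScript, or be written to imitate it, before its post is accepted.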
Keith (http://keithdevens.com/) wrote:
This has been implemented. It only took a few lines of code, and it's now (automatically) on every form on my site.
View source if you'd like.
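The thread does not show the implementation itself. Going only by the mentions of a hash and an IP check in the comments above, one guess at the general shape is a hidden field that signs the form's timestamp together with the visitor's IP. Everything here (the `SECRET`, the one-hour `MAX_AGE`, the function names, the use of HMAC-SHA256) is an assumption, not the site's code.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical server-side secret
MAX_AGE = 3600         # assumed lifetime of a form, in seconds


def _sign(ts_text: str, client_ip: str) -> str:
    return hmac.new(SECRET, f"{ts_text}:{client_ip}".encode(), hashlib.sha256).hexdigest()


def make_token(client_ip: str, now=None) -> str:
    """Value for a hidden form field: the timestamp plus its signature."""
    ts = int(time.time()) if now is None else int(now)
    return f"{ts}:{_sign(str(ts), client_ip)}"


def check_token(token: str, client_ip: str, now=None) -> bool:
    """Accept the post only if the token is unaltered, recent, and from the same IP."""
    current = time.time() if now is None else now
    ts_text, sep, sig = token.partition(":")
    if not sep or not (ts_text.isascii() and ts_text.isdigit()):
        return False
    if not hmac.compare_digest(_sign(ts_text, client_ip).encode(), sig.encode()):
        return False
    return 0 <= current - int(ts_text) <= MAX_AGE
```

Nothing is stored on the server, which avoids the time-to-live bookkeeping of the GUID approach. The trade-off is that a visitor whose address changes between loading and submitting the form gets rejected.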
Keith (http://keithdevens.com/) wrote:
Just figured I'd update this... this technique has blocked hundreds of comment spams. Success!
Oh, but it also blocked one guy whose ISP uses NetCache NetApp... software that breaks the web.
"... it's unlikely anyone would go through the trouble."
They already have gone through the trouble.