Comments on web pages can be an interesting and valuable channel for communication with your customers, but they can also be a source of polluting spam.

Recently I've been getting quite a few almost-real-looking comments that are actually bait leading to hacked web sites.

What can we do about this?

Initial web site defenses

As a web site owner, I don't want to allow inappropriate comments to appear; these might be pure spam [1], offensive or illegal statements, or misleading links.

You could restrict comments to registered users, but that requires either some review process before approving any user or some monitoring of posts to identify spammers, trolls [2], sockpuppets [3] and any other members of the undesirable internet bestiary.

CAPTCHA systems [4][5] are commonly used as pre-filters. These are the extra questions or processes that prove you're a human (but see [6]), and they take various forms like "what are these characters" or "solve this maths problem". CAPTCHA systems are a Good Thing in general, although not without problems.
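To make the "solve this maths problem" style concrete, here is a minimal sketch in Python. It is an illustration rather than a description of any particular CAPTCHA product: the names `make_challenge`, `check_answer` and `SECRET_KEY` are all made up for this example, and it assumes the server can embed a signed token in the comment form instead of storing the expected answer.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real site would load this from configuration.
SECRET_KEY = secrets.token_bytes(32)


def _sign(answer: int, nonce: str) -> str:
    """Return an HMAC over the expected answer so the server need not store it."""
    message = f"{nonce}:{answer}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_challenge() -> dict:
    """Create a simple addition question plus a token to embed in the comment form."""
    a = secrets.randbelow(9) + 1  # 1..9
    b = secrets.randbelow(9) + 1  # 1..9
    nonce = secrets.token_hex(8)
    return {
        "question": f"What is {a} + {b}?",
        "nonce": nonce,
        "token": _sign(a + b, nonce),
    }


def check_answer(submitted: str, nonce: str, token: str) -> bool:
    """Accept the comment only if the submitted answer matches the signed one."""
    try:
        answer = int(submitted.strip())
    except ValueError:
        return False
    return hmac.compare_digest(_sign(answer, nonce), token)
```

A question this simple only stops the laziest bots, and the sketch does nothing to stop a solved challenge being replayed, which is part of why CAPTCHAs are "not without problems".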

Ultimately, some junk still gets through.

Moderation in all things

A moderation method, where comments have to be reviewed and/or altered by the web site owner, is the gold standard for filtering out the rubbish.

Except it's not enough to just check that the comment isn't advertising male enhancement pharmaceutical products...

An Example

I recently received this notification about a new comment:

With my suspicious hat on:

  • I suppose Huey could be a German name, but it's more likely that the name and email address are faked (the email address might be real, but it probably doesn't belong to that person).
  • The IP address is in Canada - again Huey might be travelling, but it's more likely that someone's PC has been hijacked.
  • And then there's the content: generic waffle that could apply to any page on any site with a bit of a compliment to encourage me to click the link.
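The content check in the last bullet can be roughly automated. The sketch below is an assumption-laden illustration, not a tested spam filter: the phrase list, the `suspicion_score` function and the idea of comparing against a set of page keywords are all invented for this example. It only covers the "generic waffle plus a link" signal; checking names, email addresses or IP locations would need outside data.

```python
import re

# Illustrative phrases only; a real list would be tuned from the spam actually received.
GENERIC_PHRASES = (
    "great post",
    "nice blog",
    "keep up the good work",
    "very informative",
    "thanks for sharing",
)
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def suspicion_score(comment: str, page_keywords: set[str]) -> int:
    """Count simple warning signs (0 to 3); higher means look more closely."""
    text = comment.lower()
    score = 0
    # Sign 1: the comment carries a link.
    if URL_PATTERN.search(text):
        score += 1
    # Sign 2: the comment contains stock flattery.
    if any(phrase in text for phrase in GENERIC_PHRASES):
        score += 1
    # Sign 3: the comment mentions nothing specific to the page it is on.
    words = set(re.findall(r"[a-z']+", text))
    if not words & {keyword.lower() for keyword in page_keywords}:
        score += 1
    return score
```

For a page about comment spam you might call `suspicion_score(body, {"captcha", "moderation", "spam"})` and sort the moderation list so the highest scores get the most sceptical reading. A score like this is only a hint; a genuine but brief "thanks" with a link would score just as badly.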

Well, before we click that link, let's see what Mr Google has to say about that URL:

Oh dear. Not a good sign [7].

In a fit of enthusiasm, shall we see what's there anyway:

Rats.
My excitement has diminished sufficiently that I'm not about to see what lies beyond the redirection.

What to do about it

As a web site owner, have some pre-filtering mechanisms in place, but don't just rely on the automated systems to filter out the bad guys. There is still no substitute for human intelligence.
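One way to picture that combination is a queue where the automated filter can only hold things back, and nothing is published until a person says so. This is a minimal sketch under my own assumptions; the `Comment`, `Status` and `ModerationQueue` names are hypothetical and not taken from any particular blogging platform.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Comment:
    author: str
    body: str
    status: Status = Status.PENDING


@dataclass
class ModerationQueue:
    comments: list[Comment] = field(default_factory=list)

    def submit(self, comment: Comment, passed_prefilter: bool) -> None:
        """Pre-filter failures are marked rejected; everything else waits for a human."""
        comment.status = Status.PENDING if passed_prefilter else Status.REJECTED
        self.comments.append(comment)

    def pending(self) -> list[Comment]:
        """Comments awaiting a human decision."""
        return [c for c in self.comments if c.status is Status.PENDING]

    def published(self) -> list[Comment]:
        """Only comments a human has explicitly approved."""
        return [c for c in self.comments if c.status is Status.APPROVED]

    def decide(self, comment: Comment, approve: bool) -> None:
        """Record the human moderator's verdict, which can also rescue a rejected comment."""
        comment.status = Status.APPROVED if approve else Status.REJECTED
```

Rejected comments stay in the list rather than being deleted, so a genuine contribution that was caught by the pre-filter can still be found and approved later.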

As a commenter, understand that sometimes your genuine contribution might get caught up with a bunch of junk and get missed. Try not to take it personally.

References

1 Email spam

2 Internet Trolls

3 Internet Sockpuppets

4 Captcha systems

5 Proof of work systems

6 Dogs using the internet

7 What the 'This site might be hacked' message from Google means