E-mail Protection from Spambots

First published on 2005-11-06, updated on 2008-04-25.

You can hide e-mail addresses from spammers whilst everyone else still sees them.

The Technology

Spambots are computer programs which crawl the web looking for e-mail addresses, much like the way search engines look for interesting content.

Spambots are pretty dumb. E-mail addresses have a simple format, so a spambot needs very little intelligence to find them. In contrast, web browsers must be very smart to support HTML, CSS, scripting, streaming video and audio.

This technology gap can be used to hide e-mail addresses from dumb spambots only, while browsers still show them to people.

Munging or Obfuscating with Entities or NCRs

In No Spam, Please, Philip Semanchuk found that a new e-mail address received 815 spam messages in the first 213 days after it became available on a web page. Using a feature of HTML to encode the address resulted in just 2 spam messages over the same timeframe.

In HTML, the commercial “at” symbol can be represented as plain text like this: @. It can also be encoded as a numeric character reference (NCR) like this: &#64;. Dumb spambots only understand the plain text form, but web browsers understand both.
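The encoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the examples on this page: the function name encode_ncr, the rule of alternating hexadecimal and decimal references, and the address user@example.com are all assumptions made for the example.

```python
import html


def encode_ncr(address):
    """Encode every character of `address` as a numeric character reference.

    Hypothetical helper: even positions use hexadecimal NCRs (&#x40;),
    odd positions use decimal NCRs (&#64;). Any mix decodes the same way.
    """
    parts = []
    for index, char in enumerate(address):
        code_point = ord(char)
        if index % 2 == 0:
            parts.append(f"&#x{code_point:X};")  # hexadecimal form
        else:
            parts.append(f"&#{code_point};")     # decimal form
    return "".join(parts)


encoded = encode_ncr("user@example.com")
print(encoded)

# html.unescape decodes NCRs much as a browser would, recovering the address.
print(html.unescape(encoded))
```

Encoding every character is more than the technique strictly needs. Encoding only the “at” symbol and a few letters already removes the plain text pattern a simple spambot looks for.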

Normal HTML

cerbera@projectcerbera.com

Hexadecimal & Decimal NCRs

c&#x65;rbera&#64;pro&#x6a;&#101;&#99;&#116;c&#101;&#114;be&#114;&#97;&#x2E;c&#111;m

NCRs & Bogus Markup

<span class="project">c&#x65;rbera</span>&#64;<span class="cerbera">pro&#x6a;&#101;&#99;&#116;</span><span class="project">c&#101;&#114;be&#114;&#97;</span>&#x2E;c&#111;m
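To see why the encoded forms defeat a simple harvester, the following Python sketch compares a deliberately naive spambot with a rough stand-in for what a browser displays. The regular expression, the function names naive_harvest and browser_text, and the tag-stripping approach are assumptions for illustration only. Real spambots and real browsers are more varied than this.

```python
import html
import re

# Deliberately naive pattern, standing in for a dumb spambot (an assumption).
NAIVE_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Crude tag matcher, good enough for the simple markup in this example.
TAG = re.compile(r"<[^>]+>")


def naive_harvest(source):
    """Return addresses found by scanning the raw page source as plain text."""
    return NAIVE_EMAIL.findall(source)


def browser_text(source):
    """Approximate what a visitor sees: drop the tags, then decode the NCRs."""
    return html.unescape(TAG.sub("", source))


plain = "cerbera@projectcerbera.com"
bogus = (
    '<span class="project">c&#x65;rbera</span>&#64;'
    '<span class="cerbera">pro&#x6a;&#101;&#99;&#116;</span>'
    '<span class="project">c&#101;&#114;be&#114;&#97;</span>&#x2E;c&#111;m'
)

print(naive_harvest(plain))  # the plain address is found
print(naive_harvest(bogus))  # nothing is found in the encoded source
print(browser_text(bogus))   # the visitor still sees the full address
```

A harvester that decodes references and strips tags before matching would find the address, so this protection relies on spambots staying dumb.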

Server-side Spam Filtering

If your address is already receiving spam, using entities now is like locking the stable door after the horse has bolted. However, there is effective software to filter spam. I use SpamAssassin from Apache. Some e-mail clients and services also offer spam filtering.

Filtering is not faultless: sometimes spam gets through while genuine e-mails are blocked. The default setup for SpamAssassin works very well for me, though.

Analysis

Simple encoding lets genuine users see the address while hiding it from spambots.

Recommendation