Primer: Spam Filtering: The False Positive

By David F. Carr

Spam-blocking applications can screen out legitimate e-mail messages. A look at what can trip up your company.

What is it?
A legitimate e-mail that is not delivered because a spam filter incorrectly identifies it as junk mail.

Why is it a problem?
E-mail is an essential business tool, and there's a cost when it doesn't work as intended. For example, your company might use an application to generate order confirmations to customers. But a false positive can sidetrack a legitimate order.

How does it happen?
Spam-blocking applications, used by companies and Internet-service providers to screen traffic on incoming e-mail servers, red-flag messages that show characteristics of spam. A filter typically scans and scores each e-mail, blocking delivery of what it deems spam. A false positive results when a sender unwittingly includes enough of these red flags in a legitimate e-mail for it to be deemed spam.

How are spam scores determined?
Spam filters base scores on known spam techniques. Most filters work by parsing the headers, content and technical characteristics of e-mail, looking for specific indicators. One or two indicators alone don't usually earn a spam label, but if the filter identifies enough suspicious patterns, such as the presence of Hypertext Markup Language (HTML) or a suspicious server origin, the message's score crosses the spam threshold and the e-mail is rejected. Anti-spam systems also keep blacklists of known spammers, as well as lists of approved senders. Most anti-spam systems keep their rules secret to prevent spammers from targeting them.
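
To make the scoring idea concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's actual rule set (those are usually kept secret): the indicator names, weights, threshold and list entries are all hypothetical.

```python
# Hypothetical indicator weights; real products do not publish theirs.
INDICATOR_WEIGHTS = {
    "contains_html": 1.5,
    "suspicious_origin": 2.0,
    "all_caps_subject": 1.0,
    "huckster_phrase": 2.0,
}

SPAM_THRESHOLD = 4.0  # assumed cutoff, chosen for illustration

# Hypothetical sender lists.
BLACKLIST = {"spammer@example.net"}
APPROVED_SENDERS = {"orders@example.com"}


def classify(sender, indicators):
    """Return (verdict, score) for one message.

    `indicators` is the set of indicator names the filter detected.
    """
    sender = sender.lower()
    if sender in APPROVED_SENDERS:
        return "deliver", 0.0           # approved senders skip scoring
    if sender in BLACKLIST:
        return "reject", float("inf")   # known spammers are always blocked
    # Add up the weight of every indicator found; unknown names count as 0.
    score = sum(INDICATOR_WEIGHTS.get(name, 0.0) for name in indicators)
    verdict = "reject" if score >= SPAM_THRESHOLD else "deliver"
    return verdict, score
```

Under these assumed weights, a message that trips only "contains_html" scores 1.5 and is delivered. A legitimate newsletter that also shows "suspicious_origin" and "all_caps_subject" scores 4.5 and is rejected, which is exactly the false positive the article describes.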

What characteristics cause problems?
Suppose you send a monthly HTML e-mail that contains an image tag pointing to a graphic on your Web server. A spam filter would likely flag it because it contains HTML and links to an image—making it look a lot like a common pornography advertisement. Other content indicators include ALL CAPS text, red font tags, huckster language like "pure profit" and even the word "remove." Spammers often misuse the seemingly benign "remove me from this list" offer to verify e-mail addresses and subject them to more spam. Spam-blockers also check technical characteristics on the theory that spammers typically have sloppy coding habits. A common technical red flag is when the "From" address doesn't match the header automatically added by the e-mail server of origin. That's a problem for a company that uses a third-party service to send e-mail that appears to be coming directly from its domain. Witness Tumbleweed Communications, which makes an anti-spam application and was chagrined to find that its product filtered out Web conference invitations it had sent to its own clients using the WebEx service.
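
A sender can run a rough self-check for these characteristics before a mailing goes out. The Python sketch below uses only the standard library and makes several assumptions: the flag names are invented, the message is single-part, "ALL CAPS" is tested on the subject line only, and the Return-Path header stands in for the server-added header (the article does not name the header a filter compares against).

```python
import re
from email import message_from_string
from email.utils import parseaddr

HUCKSTER_PHRASES = ("pure profit",)  # the article's example; extend as needed


def domain_of(header_value):
    """Return the lowercased domain of an address header, or '' if absent."""
    _, addr = parseaddr(header_value or "")
    return addr.rpartition("@")[2].lower()


def find_red_flags(raw_message):
    """List the (hypothetically named) red flags found in a raw message."""
    msg = message_from_string(raw_message)
    body = msg.get_payload()
    if not isinstance(body, str):  # multipart messages are skipped for brevity
        body = ""
    lowered = body.lower()
    flags = []

    # HTML image tag pointing at a graphic on a Web server
    if re.search(r"<img\b[^>]*\bsrc\s*=", body, re.IGNORECASE):
        flags.append("remote_image")
    # ALL CAPS text (checked on the subject line only in this sketch)
    if msg.get("Subject", "").isupper():
        flags.append("all_caps_subject")
    # Red font tags
    if re.search(r"<font\b[^>]*color\s*=\s*[\"']?(red|#ff0000)", body, re.IGNORECASE):
        flags.append("red_font")
    # Huckster language
    if any(phrase in lowered for phrase in HUCKSTER_PHRASES):
        flags.append("huckster_phrase")
    # The word "remove"
    if re.search(r"\bremove\b", lowered):
        flags.append("remove_word")
    # "From" domain differs from the domain in the assumed origin header
    from_domain = domain_of(msg.get("From"))
    origin_domain = domain_of(msg.get("Return-Path"))
    if from_domain and origin_domain and from_domain != origin_domain:
        flags.append("from_mismatch")
    return flags
```

For a monthly HTML newsletter sent through a third-party service, with an image tag, an upper-case subject and a "remove" offer, this check would report four flags even though the message is legitimate.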

I'm not selling Viagra. Why should I worry?
Because spam techniques keep changing. Your application that auto-generates e-mail can work fine one day and be flooded with return mail the next. Your staff needs to stay on top of spam-blocking updates and guard against false positives. The very fact that your company uses an automated Web script to generate e-mail may now be a red flag on your customers' spam filters.

This article was originally published on 2003-12-01
David F. Carr is the Technology Editor for Baseline Magazine, a Ziff Davis publication focused on information technology and its management, with an emphasis on measurable, bottom-line results. He wrote two of Baseline's cover stories focused on the role of technology in disaster recovery, one focused on the response to the tsunami in Indonesia and another on the City of New Orleans after Hurricane Katrina. David has been the author or co-author of many Baseline Case Dissections on corporate technology successes and failures (such as the role of Kmart's inept supply chain implementation in its decline versus Wal-Mart or the successful use of technology to create new market opportunities for office furniture maker Herman Miller). He has also written about the FAA's halting attempts to modernize air traffic control, and in 2003 he traveled to Sierra Leone and Liberia to report on the role of technology in United Nations peacekeeping. David joined Baseline prior to the launch of the magazine in 2001 and helped define popular elements of the magazine such as Gotcha!, which offers cautionary tales about technology pitfalls and how to avoid them.