How to Watch Data Leaving Intelligently

By Kevin Fogarty  |  Posted 2008-07-21

It's hard enough to protect sensitive data within a large corporation when that data is well defined and easy to locate. Think customers' credit card numbers, Social Security numbers and other customer data, or internal documents like human resources files on employees, tax and other finance documents.
But sensitive data isn't always arranged in easy-to-recognize patterns the way credit card or Social Security numbers are, according to Alex Tosheff, vice president and chief information security officer for an online credit-processing company that allows customers to shop online using information that identifies them personally but doesn't involve a credit card or any account numbers at all.

"We went through a long process of classifying our data, and created policies around it and got buy-in from the people we need to, and educated people about how we were actually classifying data, and had tools to keep that stuff from getting out [across the Internet]," Tosheff says.

"But we still knew we were probably communicating things we shouldn't be," he says. "Things that were going out in [instant messaging] or through other channels people used because it was easier to do their jobs that way, but that were probably not the channels we would have chosen."

Tosheff addressed the problem with a set of data-loss prevention products from Reconnex. Rather than going out on the Internet to search documents, Reconnex products search all the data on a company's internal network and, more importantly, search and index all the data leaving the company through the firewall, whether by e-mail or in other formats.

The products, which are built into a special-purpose appliance designed to be dropped onto a network, build an index of all the information a company communicates over the Internet, no matter what format it's sent in.

Tosheff declined to identify the kinds of data he discovered leaking through his company's firewall, precisely because it was sensitive.

"There were a number of cases where we had people sending out information that would be OK, usually, except they were doing it through IM or other channels that weren't as secure as they should be," he says.

But it's not surprising that sensitive data will leak, even from companies that rely heavily on structured data and don't have much trouble defining what kinds of data need to be protected, according to Deven Bhatt, chief security officer of Airlines Reporting Corporation (ARC), a clearing house for flight data and ticket purchases that serves more than 150 airlines and train companies as well as thousands of travel agencies and other partners.

"Of course you have your structured data, credit cards and [Personally Identifiable Information] and you can build filters for that," Bhatt says. "But this appliance approach can help you scan your hard drives and data on your networks and know what data you have and who is using it, and then how it's leaving your organization."
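The kind of filter Bhatt mentions for structured data can be sketched in a few lines. The Python below is only an illustration, not Reconnex's method: the two patterns, the Luhn checksum step and the function names are assumptions chosen for the example, and a production filter would need far more care with formats and false positives.

```python
import re

# Assumed patterns for illustration only:
# 13 to 16 digits, optionally separated by single spaces or hyphens,
# and U.S. Social Security numbers written as NNN-NN-NNNN.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_structured_data(text: str) -> dict:
    """Return likely card numbers and SSNs found in outbound text."""
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):  # discard digit runs that fail the checksum
            cards.append(digits)
    return {"cards": cards, "ssns": SSN_RE.findall(text)}
```

The checksum step is what keeps an arbitrary 16-digit order number from being flagged as a card number. Unstructured data, such as a pricing sheet or a product plan, has no such pattern to match, which is the harder problem the rest of this article describes.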

For example, with this technology you can place a monitoring agent on, say, a customer database or other competitively sensitive information; the agent watches the flow of data and the profile of whoever is accessing it. The agent sees activity on the data and whose system it is being moved to, including mobile devices that are plugged in to the network, such as iPods, iPhones, USB drives or other easily transportable systems.

"After the discovery you have to discover [know] what people are doing with that information because once it leaves your organization, that is the highest risk, because you have no further control over it," says Bhatt.

Tosheff and Bhatt both use the Reconnex product to help their companies demonstrate compliance with security rules, but also to enforce internal rules or policies about the appropriate use of company networks and computers. During tax season, for example, many ARC employees e-mail tax documents either from work or through the same Internet connections they use to access servers at the office.

"So when we see that, we tell people, 'You know, you're sending very personal information across the Internet in the clear,' and they're very grateful for the warning," Bhatt says.

"It also lets people know that you really are monitoring what they're doing on the Internet the way you told them you were in the classes where you educate them about what's appropriate use of technology or handling sensitive data," Tosheff says.

The Reconnex products come in two primary parts – an x86-based server with all the required software already loaded, and a console IT managers can use to configure the monitor and gather reports or more ad hoc data. The process of finding and identifying types of data going through the firewall depends on sophisticated data-mining algorithms.

It works essentially like a search engine, sitting on its custom-designed server, filtering data as it goes out through the firewall and building an index of what data is being sent, and by whom. It uses IP addresses, LDAP and Active Directory profiles to identify specific users and connect them with the data they sent. The real configuration issue has nothing to do with the appliance, according to Tosheff and Bhatt. The real work is in creating an internal listing of what data the company owns, where it's stored and – most important – what specific types of data are sensitive.
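As a rough picture of the search-engine idea, the sketch below builds a tiny inverted index over outbound messages and resolves each sender's IP address to a user name. Everything in it is hypothetical: the `DIRECTORY` dictionary stands in for an LDAP or Active Directory lookup, and the class and field names are invented for the example rather than taken from the Reconnex product.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical stand-in for an LDAP / Active Directory lookup.
DIRECTORY = {"10.0.0.12": "asmith", "10.0.0.37": "bjones"}


@dataclass(frozen=True)
class Outbound:
    src_ip: str   # address of the machine that sent the data
    channel: str  # e.g. "email" or "im"
    body: str     # extracted text of what was sent


class OutboundIndex:
    """Minimal inverted index: term -> ids of messages containing it."""

    def __init__(self, directory):
        self.directory = directory
        self.messages = []
        self.postings = defaultdict(set)

    def add(self, msg):
        msg_id = len(self.messages)
        self.messages.append(msg)
        for term in set(re.findall(r"[a-z0-9]+", msg.body.lower())):
            self.postings[term].add(msg_id)

    def who_sent(self, term):
        """Return (user, channel) pairs for messages containing the term."""
        hits = []
        for msg_id in sorted(self.postings.get(term.lower(), ())):
            msg = self.messages[msg_id]
            user = self.directory.get(msg.src_ip, "unknown")
            hits.append((user, msg.channel))
        return hits
```

A query such as `who_sent("pricing")` only helps if someone has already decided that pricing data is sensitive, which is why the data inventory, not the appliance, is the hard part.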

"Every company has some kind of intellectual property they'd like to protect, but it's not always as clear as credit card numbers or other data with patterns that are easy to define," Bhatt says. "You have to go to each of your business units and ask them what kind of information they have and what would be the cost if they lost it, or if someone outside were to get ahold of it."

Without a fairly rigorous idea of what kinds of data you need to protect, it's impossible to build a query to discover how much of a risk the company faces, Tosheff says. It's possible to just read through the index and logs, but that's inefficient and arbitrary.

The only way to get a good handle on data leakage is to put filters and logs on every exit point from the network and use tools such as Reconnex's to keep track of the data employees are sending out, and the communications channels they use to do it, Tosheff says. Doing that at a large company could require installing a whole series of Reconnex appliances to avoid having to route all data through one gateway.

The Reconnex appliance starts at $19,000 for a basic installation, and costs can rise to several hundred thousand dollars, depending on the amount of data and the number of points on the network that should be guarded. Most companies would want more than one appliance, Bhatt says; putting one on each Internet gateway is the best way to get the most value and the best security. That means spending another $19,000 for every gateway, and potentially more on consulting help from Reconnex, which helps customers set up processes to inventory their existing data, profile the data they consider sensitive, and set flags and automated reports that notify security managers when security policies are violated.
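The per-gateway arithmetic is simple enough to write down. The sketch below uses only the $19,000 starting price quoted above; the function name is made up for the example, and the consulting figure is a placeholder parameter, since the article gives no consulting price.

```python
BASE_PRICE_USD = 19_000  # starting price per appliance quoted in the article


def minimum_appliance_cost(gateways: int, consulting_usd: int = 0) -> int:
    """Lower-bound cost of one basic appliance per Internet gateway.

    consulting_usd is a placeholder; no consulting price is given.
    """
    if gateways < 1:
        raise ValueError("at least one gateway is required")
    return gateways * BASE_PRICE_USD + consulting_usd
```

A company with four Internet gateways would therefore start at $76,000 before any consulting help or larger configurations.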

"You don't want to put all your guards on one door, then leave the other unguarded," Tosheff says. "You need to take a more architectural approach."