Data Quality: 5 Steps for 2007

More than 25% of the critical data held by Fortune 1000 companies is flawed—inaccurate, incomplete or duplicated—according to a release issued by Gartner last month.

But perhaps more shocking than the fact that a quarter of all the information kept by the largest financial, insurance, manufacturing, pharmaceutical and energy companies is wrong is that the overall state of data quality hasn’t improved in the last two years—Gartner reported the same statistic two years ago—and the research firm doesn’t expect much, if any, improvement in the next two.

So, what can businesses do to clean it up?

Two years ago, Baseline talked to a dozen data experts, asking them how corporations could improve the quality of their data. After the most recent Gartner report, we went back to many of those experts and asked them to review our list of recommendations. Some phoned in; others replied by e-mail. While they added a tweak here and there, their original advice pretty much stood the test of time. Here’s the update:

ACKNOWLEDGE THE PROBLEM. The first step in any recovery is admitting there is a problem. And data quality experts say every company has bad information.

DETERMINE THE EXTENT OF THE PROBLEM. Data cleansing tools can monitor the files coming into a system and count the omissions, duplications and typos within each. Companies should also conduct accuracy assessments to test the validity of their data, says Larry English, president of information quality consultancy Information Impact International. For instance, a company could pull up a sales file and call a set of customers to check their names, addresses and so on. English says he’s aware of one company that did an assessment on the marital status of 2,000 customers and found more than 20% of the data inaccurate.
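For teams that want a rough gauge before buying tools, a first pass can be scripted. The sketch below is illustrative only—it assumes customer records sit in a CSV file with hypothetical column names—and counts the omissions, duplications and malformed values that cleansing tools tally, then pulls a sample for the kind of call-back accuracy check English describes.

```python
import pandas as pd  # assumes pandas is installed

# Hypothetical customer extract; file and column names are illustrative.
df = pd.read_csv("customers.csv")

# Omissions: fields left blank, per column.
missing = df.isna().sum()

# Duplications: rows that exactly repeat an earlier record.
duplicates = int(df.duplicated().sum())

# Typos/malformed values: a crude shape check on one field.
email_ok = df["email"].astype(str).str.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
bad_emails = int((~email_ok).sum())

print("Missing values per column:", missing.to_dict())
print("Exact duplicate rows:", duplicates)
print("Malformed email addresses:", bad_emails)

# Accuracy assessment: software can't confirm a customer's real address,
# so sample records for a manual call-back check.
df.sample(n=min(50, len(df)), random_state=1).to_csv("callback_sample.csv", index=False)
```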

PUT A DOLLAR FIGURE ON IT. For a large company, Tom Redman, president of data quality consultancy Navesink, has said the tab for software, services and any corporate reengineering needed to improve data collection and processing could run up to $1 million. That may sound like a lot of money, but it isn’t next to what bad data costs: Gartner counts customer turnover, missed sales opportunities, and bad budgeting, planning and distribution among the prices companies pay for data flaws. English even puts a real number on it, saying companies can lose 15% or more of their revenue to problems stemming from bad information.
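The back-of-the-envelope math is stark. Here is a minimal sketch with assumed figures—a hypothetical $500 million company, English’s 15% loss rate and Redman’s $1 million ceiling:

```python
# Illustrative numbers only; the revenue figure is an assumption.
annual_revenue = 500_000_000      # hypothetical mid-size firm
loss_rate = 0.15                  # English: 15% or more of revenue
remediation_cost = 1_000_000      # Redman: up to $1 million for a large company

annual_loss = annual_revenue * loss_rate
print(f"Estimated annual loss to bad data: ${annual_loss:,.0f}")          # $75,000,000
print(f"Loss vs. remediation: {annual_loss / remediation_cost:.0f} to 1") # 75 to 1
```

On those assumptions, the cleanup pays for itself in under a week.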

PUT SOMEBODY IN CHARGE. Redman and English both say companies should create a management position focused on data quality (Redman calls it chief data officer; English refers to it as chief information quality officer). The person filling this role would be responsible for measuring data accuracy, bringing in tools to keep it clean and, most important, putting in place processes to ensure that data is correct when entered at its source—by redesigning data input tasks and changing worker habits.

USE AVAILABLE TOOLS. Companies should employ root-cause analysis tools to find out what’s causing their data flaws and to define process improvements. Then use data cleansing tools to fix inaccuracies. Better yet: master data management, by which a company consolidates data into a single repository where it can be assessed, cleaned and verified. An even more elaborate solution is what Gartner vice president Ted Friedman calls a “data quality firewall”—data cleansing and other tools placed between a database and its information sources to check data as it enters the repository.
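In practice, Friedman’s firewall boils down to a validation gate that quarantines records before they reach the repository. The sketch below is a minimal illustration, not any vendor’s product: the field names and rules are hypothetical, and a real deployment would sit in the ETL layer and route rejects back for repair.

```python
from datetime import datetime

def validate(record: dict) -> list:
    """Return rule violations; an empty list means the record passes."""
    errors = []
    if not record.get("customer_id"):
        errors.append("missing customer_id")
    if "@" not in record.get("email", ""):
        errors.append("malformed email")
    try:
        datetime.strptime(record.get("signup_date", ""), "%Y-%m-%d")
    except ValueError:
        errors.append("bad signup_date")
    return errors

def firewall(incoming):
    """Split a batch into clean records and quarantined (record, reasons) pairs."""
    clean, quarantined = [], []
    for rec in incoming:
        problems = validate(rec)
        if problems:
            quarantined.append((rec, problems))
        else:
            clean.append(rec)
    return clean, quarantined

batch = [
    {"customer_id": "C100", "email": "ada@example.com", "signup_date": "2006-11-02"},
    {"customer_id": "", "email": "not-an-email", "signup_date": "last week"},
]
clean, quarantined = firewall(batch)
print(len(clean), "accepted;", len(quarantined), "quarantined")  # 1 accepted; 1 quarantined
```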

Companies should also try to prevent flaws from entering systems in the first place by designing enterprise-strength information models and databases with clear and correct definitions, according to English. Quality shared databases, he says, serve as a single source of data creation and control data movement, minimizing transfers and the potential for introducing new errors.
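English’s point about clear definitions can be pushed into the schema itself, so the database refuses bad data rather than storing it. A minimal sketch using SQLite; the table, column names and rules are illustrative assumptions, not a recommended model:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraints encode the data definitions: one row per customer (the
# single source English describes), no blank names, a crude email shape check.
conn.execute("""
    CREATE TABLE customer (
        customer_id TEXT PRIMARY KEY,
        full_name   TEXT NOT NULL CHECK (length(trim(full_name)) > 0),
        email       TEXT NOT NULL CHECK (email LIKE '%_@_%._%')
    )
""")

conn.execute("INSERT INTO customer VALUES ('C100', 'Ada Lopez', 'ada@example.com')")
try:
    conn.execute("INSERT INTO customer VALUES ('C101', '  ', 'not-an-email')")
except sqlite3.IntegrityError as err:
    print("Rejected at the source:", err)  # CHECK constraint failed
```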

But the only real way to improve data quality is with a concerted effort. As Gartner’s Friedman puts it: “Improvement in data quality requires a combination of people, process and technology focus. All three aspects must be brought to bear in order to achieve lasting positive results.”