More than 25% of critical data in Fortune 1000 databases is inaccurate or incomplete, according to research firm Gartner. Simply stated: Companies are basing key business decisions on faulty inventory tallies, erroneous financial information, inaccurate supplier data and incorrect employee records.
“In your average organization, there are a lot of assumptions that the data is accurate. And I think that very few organizations stop to actually look and measure it,” says Gartner vice president Ted Friedman. “The ones that do stop and measure are typically horrified by what they find.”
The result of bad data is diminished productivity, employee turnover, alienated customers and lost sales. Yet, say data quality experts such as Randy Bean, managing partner of consultancy NewVantage Partners and a former data manager at Harte-Hanks Data Technologies and Bank of Boston, many companies have no idea how accurate their data is, nor do they have solid data quality improvement plans in place.
Baseline queried a dozen data experts on how to start and maintain a data quality initiative, including Friedman and Bean; Larry English, president of data quality consulting firm Information Impact; and Larissa Moss, a data management consultant with the Cutter Consortium. Here’s their collective, five-step plan.
1. Acknowledge the problem.
You might think your company doesn’t have a problem with the accuracy of information it keeps on customers, employees and business partners, but you might want to think again. Data accuracy, says Dana Rafiee, U.S. director for Destiny Corp., an international business and technology consulting company, “is a major issue in every organization.”
And if a company is going to improve the quality of its data, says Gartner’s Friedman, “The first step is admitting you have a problem.”
2. Determine the extent of the problem.
Companies need to get a handle on just how bad the problem is before they can figure out the best way to fix it, English says. There are typos, duplications and omissions, as well as inconsistent product codes and other variables, and those problems are compounded as files are collected, aggregated and merged.
Determining the extent of the problem on customer files, sales information and product specs starts with an assessment. There are software tools found in packages such as SAS’ Data Quality Solution and Firstlogic’s IQ Insight that can monitor and count the number of files coming into a system and the omissions, duplications and typos within each.
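Commercial packages do this at scale, but the core of such an assessment pass can be sketched in a few lines of Python. The records and field names below are hypothetical, standing in for a real customer file:

```python
from collections import Counter

def profile(rows, key_field):
    """Count rows, missing values per field, and duplicate key values."""
    total = 0
    missing = Counter()
    keys = Counter()
    for row in rows:
        total += 1
        keys[row.get(key_field, "")] += 1
        for field, value in row.items():
            if value is None or str(value).strip() == "":
                missing[field] += 1
    duplicates = sum(n - 1 for n in keys.values() if n > 1)
    return {"rows": total, "missing": dict(missing), "duplicate_keys": duplicates}

# Hypothetical customer records with one blank name, one blank ZIP,
# and a duplicated ID.
records = [
    {"id": "1", "name": "Acme Corp", "zip": "02110"},
    {"id": "2", "name": "", "zip": "10001"},
    {"id": "2", "name": "Globex", "zip": ""},
]
report = profile(records, "id")
print(report)
```

Even this naive count of omissions and duplications gives a first measurement to react to; real profiling tools add type checks, pattern analysis and cross-file matching.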
3. Establish the costs of getting it right—and the costs of getting it wrong.
How much does a data quality program cost? It depends on the size of the company, whether it operates in an information-intensive industry like banking or insurance, and the extent of the problem. But any solution is going to cost money. For a large company, says Tom Redman, president of Navesink, a data quality consultancy, the tab for software, services, and any corporate reengineering needed to improve data collection and processing operations could run up to $1 million.
The financial guys need to be convinced that there will be a return on data quality spending.
“It’s interesting to say we have only 60% completeness of our data. But that doesn’t mean anything to the CFO,” Friedman points out. “You have to put it in some quantified business terms. You have to say, because of that we burn an extra $10 million a month in expenses.”
English adds that companies without proper information management and control are already spending 10% or more of their operating revenue fixing problems that stem from bad data.
You can also quantify revenue lost to alienated customers by correlating data inaccuracies with reduced spending among the customers affected.
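Friedman’s point about quantifying in business terms amounts to simple arithmetic. The figures below are purely hypothetical, chosen only to show the shape of the calculation:

```python
# Hypothetical figures: translate an error rate into a monthly dollar
# impact a CFO can act on.
monthly_invoices = 200_000
error_rate = 0.04             # assume 4% of invoices carry a data defect
rework_cost_per_error = 45.0  # assumed staff time to find and fix one defect
lost_margin_per_error = 80.0  # assumed margin lost to churn and returns

errors = monthly_invoices * error_rate
monthly_cost = errors * (rework_cost_per_error + lost_margin_per_error)
print(f"{errors:.0f} defective invoices -> ${monthly_cost:,.0f}/month")
```

“60% completeness” becomes “$1 million a month,” which is the form of argument a finance executive can weigh against the cost of the fix.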
4. Use available tools.
Software can help fix many inaccuracies and prevent ordinary errors from occurring.
Data cleansing tools, such as Firstlogic’s Information Quality Suite or IBM’s Ascential software, can take a piece of data such as an address, check its accuracy against a reference record, such as the U.S. Postal Service’s database, and correct it if it’s wrong.
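The correction step these tools perform can be illustrated with a minimal sketch, assuming a small lookup table stands in for an authoritative source such as the postal database. Real products also parse, standardize and fuzzy-match; this only corrects the city and state when the ZIP code disagrees:

```python
# Hypothetical excerpt of an authoritative reference file keyed by ZIP.
POSTAL_REFERENCE = {
    "02110": {"city": "Boston", "state": "MA"},
    "10001": {"city": "New York", "state": "NY"},
}

def cleanse_address(record):
    """Correct city/state against the reference when they disagree."""
    ref = POSTAL_REFERENCE.get(record["zip"])
    if ref and (record["city"] != ref["city"] or record["state"] != ref["state"]):
        record = {**record, **ref, "corrected": True}
    return record

fixed = cleanse_address({"zip": "02110", "city": "Bston", "state": "MA"})
print(fixed)  # city corrected to "Boston"
```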
Better yet: master data management, by which a company consolidates a set of data into a single repository where it can be assessed, cleaned and verified. This data store then becomes the “system of record” against which all other files are checked.
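Checking other files against that system of record can be sketched as follows, with a hypothetical consolidated customer store playing the role of the master repository:

```python
# Hypothetical master data store: the "system of record."
MASTER = {"C-100": {"name": "Acme Corp", "zip": "02110"}}

def reconcile(record):
    """Return the fields of a record that disagree with the system of record."""
    golden = MASTER.get(record["id"])
    if golden is None:
        return {"status": "unknown_id"}
    diffs = {k: (record[k], golden[k])
             for k in golden if record.get(k) != golden[k]}
    return {"status": "clean" if not diffs else "mismatch", "diffs": diffs}

print(reconcile({"id": "C-100", "name": "Acme Corp", "zip": "02111"}))
```

Each mismatch report is a cleanup work item; once every feeding system reconciles against the same store, the duplication and inconsistency problems of Step 2 stop accumulating.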
An even more elaborate solution is creating what Friedman calls a “data quality firewall.” Much like a network security firewall, it sits between a database and the source of the data, proactively checking information for errors as it enters the repository. The firewall uses data cleansing and other software tools to automatically fix problems according to the company’s requirements. It then creates a record of the data and compiles statistics on errors. “[It’s] putting the controls and measurements at the edges of the enterprise,” Friedman says.
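A toy version of such a firewall, with hypothetical rules and field names, validates each inbound record, auto-fixes what policy allows, quarantines what it cannot fix, and tallies error statistics:

```python
import re

# Running statistics the firewall compiles as records pass through.
stats = {"received": 0, "fixed": 0, "rejected": 0}

def firewall(record):
    """Admit a record, auto-fixing or rejecting it per policy."""
    stats["received"] += 1
    rec = dict(record)
    # Auto-fix permitted by policy: normalize whitespace and case.
    rec["email"] = rec.get("email", "").strip().lower()
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", rec["email"]):
        stats["rejected"] += 1
        return None  # quarantine for manual review
    if rec != record:
        stats["fixed"] += 1
    return rec

firewall({"email": " Ted.Friedman@Example.com "})  # normalized, admitted
firewall({"email": "not-an-address"})              # rejected
print(stats)  # {'received': 2, 'fixed': 1, 'rejected': 1}
```

The statistics dictionary is the “controls and measurements at the edges” in miniature: it tells you, per source, how dirty the inbound data is before it ever reaches the repository.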
5. Put somebody in charge.
You can’t automate accuracy. You must mind it.
Redman of Navesink says companies should create a position called Chief Data Officer. The person filling this role would be responsible for measuring data accuracy, bringing in tools to keep it clean and, most important, putting in place processes to ensure that data is correct when entered at its source—by redesigning data input tasks and changing worker habits. Or give the responsibilities to a top-level executive in either a key business or technology role.
This person, he says, “would help manage data as a business asset,” getting judged on lowering levels of errors found, customer complaints received, returned merchandise, refusals to make payments and the like.
Improving accuracy requires constant attention, Gartner’s Friedman explains. “Technology and process and people change,” he says. “All have to come to bear.”