How Clean Data Can Transform Your Business

By Paul A. Strassmann  |  Posted 2006-07-06

Before you integrate isolated systems, you must have clean data. That's not as easy as you might think.

You've heard the buzzwords: service-oriented architecture and enterprise service bus. But those terms refer only to the technology, not the critical steps that companies must take to maximize the promise of business transformation, namely freeing their systems from isolated applications and migrating them to highly reliable and immediately available services.

Transformation involves building new systems that resemble existing applications and will ultimately displace them, without a large increase in I.T. spending. The transformed systems must have reliability, integrity, security, economy, ease of use and Google-like response. More important, though, the data in use by these applications, such as customer or transaction records, must be consistent, or else transformation will not take advantage of the new services.

The U.S. Department of Defense, for one, is on a transformation march. Consider this: The projected fiscal year 2007 information-technology budget of the DOD is $30.5 billion. This includes 273 major systems development projects in logistics, 189 major systems projects in personnel administration, and 255 in finance and administration. Each of these projects will spawn multiple applications and several databases.

Problem is, most of the 273 logistics applications will not readily communicate with each other. They were conceived at different times, using different technologies and data formats. The same is true about personnel, finance and administration, as well as warfare applications.

Application A must have data available for application B for it to function, but cannot get it except through a slow translator C. Application B also has data essential for application E, but cannot display it instantly because A and E have inconsistent data definitions.

Those are the very same problems that grabbed headlines in The 9/11 Commission Report. If the Department of Defense is unable to connect the dots between multiple sources of information, it cannot effectively wage war on terror.

Many commercial firms fail to realize the full potential of customer information, history of transactions and other business data. There are now at least 10 global firms with annual I.T. budgets in excess of $3 billion, and hundreds with technology budgets that top $300 million. Many of these firms suffer the same blockages as the Department of Defense. Insufficient integration will be reflected in excessive costs for maintaining incompatible systems, in huge delays for fielding innovations, and in the delivery of low-quality results. You cannot compete successfully in global commerce unless you can interrelate business information from multiple sources.

The first step in business transformation: enterprisewide standardization of data. That calls for the declaration of a metadata directory as the template for defining data that can circulate within a firm's information systems. The policy and implementation of an enforceable metadata directory likely will be resisted by bureaucrats, who see this as a threat to their indispensability. It will not be welcomed by systems developers, contractors and vendors, who prefer to concentrate on upgrading software as a technologically more interesting—and profitable—task.

To reach agreement on the representation, semantics and taxonomy of data, you will likely go through a painful political process that must be adjudicated by line management. This can get messy because it will reveal that a large percentage of installed software perpetuates incompatible, unreliable, insufficiently secure and delayed information.

Once an organization has defined its metadata for both internal and external communications, it may begin cleansing data files to eliminate non-conforming data, or deploy filtering translators that will allow legacy applications to co-exist with the new solutions.
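To make the two options concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the two-field directory, the field names customer_id and order_date, the legacy CUST_NO and ORD_DT layout, and the helper functions are hypothetical and do not describe any real system. It shows a metadata directory used as a template, a cleansing pass that sets aside non-conforming records, and a filtering translator that lets a legacy format pass through the same check.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One entry in a hypothetical metadata directory."""
    pattern: str       # regular expression a conforming value must fully match
    description: str


# Hypothetical enterprise-wide definitions; a real directory would be far larger.
METADATA_DIRECTORY = {
    "customer_id": FieldSpec(r"C\d{6}", "letter C followed by six digits"),
    # Checks the shape only; it does not verify that the date exists.
    "order_date": FieldSpec(r"\d{4}-\d{2}-\d{2}", "calendar date as YYYY-MM-DD"),
}


def violations(record):
    """Return the names of directory fields that are missing or non-conforming."""
    bad = []
    for name, spec in METADATA_DIRECTORY.items():
        value = record.get(name)
        if not isinstance(value, str) or re.fullmatch(spec.pattern, value) is None:
            bad.append(name)
    return bad


def translate_legacy(record):
    """Filtering translator for an assumed legacy layout:
    CUST_NO is a number and ORD_DT is written as M/D/YYYY."""
    month, day, year = record["ORD_DT"].split("/")
    return {
        "customer_id": f"C{int(record['CUST_NO']):06d}",
        "order_date": f"{year}-{month.zfill(2)}-{day.zfill(2)}",
    }


def cleanse(records, translator=None):
    """Split records into those that conform to the directory and those that do not."""
    clean, rejected = [], []
    for raw in records:
        try:
            candidate = translator(raw) if translator else raw
        except (KeyError, ValueError, TypeError, AttributeError):
            rejected.append(raw)   # could not even be translated
            continue
        if violations(candidate):
            rejected.append(raw)   # translated, but still non-conforming
        else:
            clean.append(candidate)
    return clean, rejected


legacy = [
    {"CUST_NO": 42, "ORD_DT": "7/6/2006"},
    {"CUST_NO": "n/a", "ORD_DT": "7/6/2006"},
]
clean, rejected = cleanse(legacy, translate_legacy)
```

In this sketch the rejected list is the pile that would otherwise be fixed by hand, which is why the size of that pile is a useful input to any business case for cleansing.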

Transformation without data cleansing will inhibit interoperability and preserve the existence of thousands of fragile lines of code that each application must host to solve its unique data control requirements. Without data cleansing, you will need to deploy workers to manually fix errors as customer demands for high-quality information services grow.

You can create a business case to justify measures to improve the quality and accuracy of data. For one example, see "The Price of Dirty Data." Keep in mind, though, that you will need to invest in a good pair of marching boots before you head down the transformation path.

Paul A. Strassmann is delivering a series of lectures on information economics at George Mason University. He is also a consultant to the Department of Defense. You can contact him at pstrassm@gmu.edu.

Paul Strassmann created and trademarked the Information Value-Added and Information Productivity formulas behind the Baseline 500 rankings. His career in technology, which began in 1956, includes stints as a top information-technology executive at Xerox, General Foods, Kraft, the Department of Defense and NASA.

Strassmann is president of The Information Economics Press and senior advisor to Science Applications International Corp. He is also Distinguished Professor of Information Sciences at George Mason University's School of Information Technology and Engineering.

He has written numerous articles and books on information management, including Information Payoff: The Transformation of Work in the Electronic Age (1985) and The Squandered Computer (1998).
