Gotcha! Extending Existing Systems

Did you know that:

If you want to do more with the information in your existing systems, you are opening a can of worms

If you want to do more than scrape data off the screen of your older applications and put a Web face on them, you’ll have to go back into the code and adapt it.

That means making modifications and extensions to the original software to allow data to be directly accessed through what Gartner Inc. research director Dale Vecchio calls a “modern” architecture such as Microsoft’s .NET framework or Sun’s competing SunOne platform, which create ways for disparate computer systems to communicate and interact online. Once you’ve opened the code, you may find yourself facing a complete rewrite of the software to get it to work right. Very often, you won’t see the bugs coming, because documentation is poor or missing.

If all you want to do is present information onscreen in the same way as in the past, products from IBM and other, smaller vendors—including ClientSoft, Jacada and Seagull Software—will help you with the move to the Web.

“It’s fairly quick, and low cost. But it’s limited to the information that’s presented in terminals,” says Vecchio, “if you’re not willing to open up Pandora’s Box and go in and change the [original] code.”

When you put on a new face, performance and reliability might actually get worse

Even the scraping approach means adding complexity: You’ll need a new interface, and new middleware to connect it to the underlying applications.

As a result, the translation of user instructions can be slow, or fail outright if there’s a hole in the middleware. The new approach will only be as reliable as the weakest link in the layers of software.

“Are you sure you can make it as fast as the current system?” Vecchio asks.

Software written for desktop computers and the Web is rarely as reliable as the original mainframe application. Making data available over the Web means more points of failure, including Web servers, users’ computers and all the network points in between.
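One rough way to put numbers on the extra points of failure: if a request succeeds only when every layer is up, and the layers fail independently, the end-to-end availability is the product of the individual availabilities. The sketch below assumes exactly that. The layer names and percentages are made up for illustration and are not measurements from any system mentioned here.

```python
from math import prod


def chain_availability(layers):
    """Availability of a request path in which every layer must be up.

    Assumes layers fail independently, which is a simplification.
    """
    return prod(layers.values())


# Hypothetical availability figures, chosen only for illustration.
mainframe_only = {"mainframe": 0.999}
web_front_end = {
    "mainframe": 0.999,
    "middleware": 0.995,
    "web_server": 0.995,
    "network": 0.99,
}

before = chain_availability(mainframe_only)
after = chain_availability(web_front_end)
print(f"mainframe only:  {before:.4f}")
print(f"with Web layers: {after:.4f}")
```

Under these assumptions the layered path comes out less available than even its weakest single layer, so the weakest link is an upper bound rather than the expected result.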

You’ve got to watch out for semantics

If you try to integrate older applications by using middleware and data integration, there’s another major hazard: semantics, or the meaning of data.

The meaning can vary from one legacy application to another, even when the data appears identical. For example, in a Toyota Motor Sales data integration project, according to quality manager John Gonzales, the data field “VIN,” for vehicle identification number, had different meanings across the company’s 20 databases, even though the data in many cases had the same structure.

“If you don’t pay attention to data semantics, you’ll spend a lot of time fixing problems later,” Vecchio says.
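One way to catch this kind of mismatch early is to write down what each field means in each database and flag any name that carries more than one meaning. The following is a minimal sketch of that idea. The database names, field names and meanings are hypothetical and are not taken from Toyota's systems.

```python
from collections import defaultdict

# Hypothetical field catalog: (database, field name) -> documented meaning.
# Every entry here is invented for illustration.
CATALOG = {
    ("sales_db", "VIN"): "vehicle sold to the customer",
    ("service_db", "VIN"): "vehicle brought in for service",
    ("trade_in_db", "VIN"): "vehicle traded in by the customer",
    ("sales_db", "ZIP"): "customer postal code",
    ("service_db", "ZIP"): "customer postal code",
}


def find_semantic_conflicts(catalog):
    """Return field names that carry more than one meaning across databases."""
    meanings = defaultdict(set)
    for (_database, field), meaning in catalog.items():
        meanings[field].add(meaning)
    # Keep only the fields whose meaning differs somewhere.
    return {field: sorted(found) for field, found in meanings.items() if len(found) > 1}


conflicts = find_semantic_conflicts(CATALOG)
for field, found in conflicts.items():
    print(f"{field} has {len(found)} meanings: {found}")
```

A check like this only works if someone has recorded the meanings in the first place, and with older systems that is usually the harder part.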

The FBI faced this problem in the late 1990s as it attempted to integrate disparate fingerprinting systems into its giant Integrated Automated Fingerprint Identification System.

Fingerprints used by the FBI’s National Crime Information Center data store were in a different format and presented at a different resolution than those in the fingerprint identification system. The FBI was forced to develop a custom means of exchanging fingerprint data between the two systems.

The most pain will come from figuring out how and where data will be united

Short of dumping existing applications in favor of standardized software, the only way to resolve many of the inconsistencies in the data in older systems may be to aggregate it all in a central repository, a.k.a. warehouse. But to do that, you’ll need to resolve all the relationships between disparate databases and any differences in data semantics beforehand, by building a single data model.

Even if you’re not using a data warehouse, and instead accessing data through middleware directly from its source, you’ll still need to do most of the work required for a data model, mapping out how the data from various sources fits together.

That’s not a trivial task. Toyota Motor Sales took six months to get a unified model together for all its customer data across 20 different databases. And those databases were built on widely supported commercial software. Systems like the FBI’s IAFIS and NCIC are single-purpose systems that depend largely on custom-built databases, which makes the integration process more expensive.
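To give a sense of what mapping sources onto a single model involves, here is a small sketch in which two sources use the same field name for different things and each is translated into its own slot in a unified customer record. The source names, field names, record layout and placeholder VIN values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    """Unified customer record (an assumed, heavily simplified model)."""
    customer_id: str
    purchased_vin: Optional[str] = None
    serviced_vin: Optional[str] = None


# Per-source mapping from source field name to unified field name.
# Both sources call their field "VIN", but each means something different.
FIELD_MAP = {
    "sales_db": {"CUST_NO": "customer_id", "VIN": "purchased_vin"},
    "service_db": {"CUSTOMER": "customer_id", "VIN": "serviced_vin"},
}


def merge_records(rows):
    """Fold (source, row) pairs into one Customer per customer_id."""
    customers = {}
    for source, row in rows:
        mapping = FIELD_MAP[source]
        # Rename source fields to unified names; unmapped fields are dropped.
        mapped = {mapping[key]: value for key, value in row.items() if key in mapping}
        customer_id = mapped["customer_id"]
        customer = customers.setdefault(customer_id, Customer(customer_id))
        for name, value in mapped.items():
            setattr(customer, name, value)
    return customers


rows = [
    ("sales_db", {"CUST_NO": "C100", "VIN": "VIN-AAA"}),
    ("service_db", {"CUSTOMER": "C100", "VIN": "VIN-BBB"}),
]
unified = merge_records(rows)
print(unified["C100"])
```

The code is the easy part. Deciding what belongs in each mapping table, one field and one database at a time, is where the months go.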