By Baselinemag  |  Posted 2006-06-12

The auto data aggregator ditched its mainframe, spending more than $20 million to build a data factory. Was it worth it?

The Road Test

The new system, according to Vasconi, has delivered on Polk's expectations. It's cheaper to maintain—close to the company's original goal of cutting maintenance costs by 50%, he says—and faster at processing data, although Vasconi couldn't provide specific metrics to back up that claim.

First, he says, the initial acquisition costs for hardware and software (which he wouldn't disclose) were 40% lower than buying a comparable amount of IBM mainframe processing power. Plus, Polk's ongoing maintenance fees—to vendors including Dell, Tibco, Oracle, Informatica and DataFlux—will be less than what it has paid to IBM.

An even bigger area of savings for Polk: The Data Factory has let the company reduce head count in the data operations group by 43%, from 56 to 32, Walker says. Mainly, the reduction in staff was possible because many manual steps in the process have been automated. "We've eliminated scores and scores of manual touches," he says.

Vasconi, expanding on the factory metaphor, compares the new system to a manufacturing assembly line that uses robots to put components together. Humans sit in a glass-lined booth and only intervene when something goes wrong. With the mainframe system, workers were needed on the factory floor to push levers and buttons. Some business processes, Vasconi says, "were just broken." For example, administrators would have to check for vehicle registration data that arrived from the various states before sending it through the system; that's now automated.

Also, with the new system, Polk can catch data-processing errors earlier, reducing the need to rerun an entire job. DataFlux's data-quality analysis software lets operators identify anomalies (say, an unusually low number of sales in a particular state, which could indicate an error) while processing is still under way. In a batch-processing mainframe environment, "you don't have the ability to stop the batch in mid-process and do a quality check," Vasconi explains. "If it was wrong, you'd have to run it all over again and find out where in the 50 steps along the way the data anomaly occurred."
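To make the idea of a mid-process quality check concrete, here is a minimal Python sketch. It is not Polk's or DataFlux's actual implementation: the function names, the per-state record counts, and the 50% threshold are all assumptions chosen for illustration. It runs a series of processing stages and, after each one, compares every state's record count with its historical average, halting at the first stage that produces an unusually low count.

```python
from statistics import mean


def find_low_volume_states(current, history, min_ratio=0.5):
    """Return states whose current count is below min_ratio of their historical mean.

    `current` maps state -> record count; `history` maps state -> list of past counts.
    The 0.5 default is an arbitrary illustrative threshold, not a figure from the article.
    """
    flagged = []
    for state, count in current.items():
        past = history.get(state)
        if not past:
            continue  # no baseline for this state, so it cannot be judged
        if count < min_ratio * mean(past):
            flagged.append(state)
    return sorted(flagged)


def run_pipeline(batch, history, stages):
    """Run stages in order, checking volumes after each and stopping at the first anomaly."""
    for index, stage in enumerate(stages, start=1):
        batch = stage(batch)
        flagged = find_low_volume_states(batch, history)
        if flagged:
            # Stop here: the faulty stage is known, so earlier work need not be redone.
            return {"status": "halted", "stage": index, "flagged": flagged}
    return {"status": "ok", "stage": len(stages), "flagged": []}
```

Because the check runs after every stage, a failure points directly at the stage that introduced the anomaly, instead of leaving an operator to search all the steps of a completed batch.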

That efficiency has also allowed Polk to reduce the time it takes to turn raw data into a product available to customers by more than 50%, Vasconi claims. He doesn't have an overall average of the improvement. However, with the previous system, data sometimes just sat around for days so that it could be grouped into batch-processing jobs. The Data Factory eliminates that waiting period.

"We've taken multi-hour processes down to multi-minute processes," Vasconi says.

As for what Polk would have done differently, Walker says the company probably should have taken more time in the initial planning stage. The accelerated time line—planning took less than six months—most likely increased the price tag for the project, Walker believes, because additional contractors were required to supplement the work of RLPTechnologies' full-time staff. "That's just a gut feel," he notes. "But I think if we'd gone just a little slower, it would have cost less."

Vasconi, though, says speed was of the essence for the project. "Internal initiatives tend not to move with passion and singular focus and inventiveness," he says. "We felt we needed a sense of urgency."

