How to Get Real-Time Analytics from a Data Warehouse
It’s a cliché in business to say that time is money, but the people who repeat it usually don’t quantify how much of one it costs to buy the other.
That’s probably a good thing, too, because the equation is changing. Time is getting cheaper.
Not radically cheaper: Saving time for other parts of the business still costs IT hard dollars for compute power, lines of code and bandwidth.
But the kind of technology that can crunch and deliver data quickly enough that its results are considered “real-time” has become so much less esoteric that it is now available to more than just brokerages, air-traffic controllers and emergency response agencies—organizations that would lose either their clients or their shirts if their information were more than a few seconds old.
Many of the companies moving into real-time systems are those for whom it’s a priority to measure time in seconds and to preserve as many as possible. Others are more surprising. Grocery products, for example, come with notoriously narrow profit margins, short shelf lives and complex buying patterns that have made after-the-fact data mining more successful than real-time sales tracking.
But near-real-time data allows grocery chain Haggen, Inc. to respond to low-inventory warnings as the inventory dwindles, rather than long after both inventory and prospective customers are gone.
“The software in our stores used to serve up summary files and log data at 3 a.m. We’d pull it across from all our stores by about 6 a.m. and begin the process of loading it into our data warehouse. By about 9 a.m., the business would begin to get some idea of what had gone on the previous day,” according to Harrison Lewis, CIO of the 33-store chain, which is based in Bellingham, Wash.
“Now we have a trickle-feed of data coming in all the time from the stores. Within fractions of a second after the transaction we can pull the data out, convert it to XML, send it to the controller and on to the server,” Lewis says. “We get visibility throughout the day; at 9 a.m. I can say how I was doing 15 minutes ago.”
That’s important for a store that focuses on high-quality, local products and—more importantly—products being mixed or baked or cooked in the store and sold for high margins in the company’s specialty departments.
Since we’re “dealing with products we’re manufacturing in the store, or with items that are doing incredibly well, having the ability to respond faster gives me the ability to take advantage of a good situation or minimize the impact of a bad one,” Lewis says.
That’s normally the kind of calculation that’s made over days or weeks in the grocery business, and one that’s difficult to justify with hard return-on-investment numbers, Lewis says. But it does give individual stores a greater ability to respond to local conditions at a pace competitors can’t match, he says.
Lewis uses real-time data integration software from GoldenGate Software, which uses a centralized data warehouse as a repository and distribution engine for online transaction processing (OLTP) data. Rather than process OLTP data in batches nightly or hourly, GoldenGate processes it continually, trickling transactions a few at a time into the data warehouse and disseminating them from there to other applications.
That way, according to Sami Akbay, vice president of product management and marketing for GoldenGate, the real-time system doesn’t burden the production system by polling it continually. The OLTP system operates normally, and all the business intelligence and other real-time applications pull data from the warehouse, which also cleanses the data to make it usable by other applications.
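The trickle-feed pattern described above can be sketched in a few lines of Python. This is an illustration only, not GoldenGate's implementation: the in-memory SQLite database stands in for the warehouse, the `deque` stands in for whatever mechanism captures changes from the OLTP system, and every name here (`sales`, `trickle_load`, `units_sold`, the sample rows) is hypothetical.

```python
import sqlite3
from collections import deque


def make_warehouse():
    """Create an in-memory stand-in for the data warehouse."""
    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE sales (store_id INTEGER, sku TEXT, qty INTEGER, sold_at TEXT)"
    )
    return wh


def trickle_load(wh, change_queue, max_rows=3):
    """Move up to max_rows captured transactions into the warehouse.

    Returns the number of rows loaded. Calling this every second or so,
    instead of once a night, is what makes the feed a "trickle".
    """
    rows = []
    while change_queue and len(rows) < max_rows:
        rows.append(change_queue.popleft())
    with wh:  # commits on success, rolls back on error
        wh.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def units_sold(wh, sku):
    """Analytics query: it reads the warehouse, never the production system."""
    (total,) = wh.execute(
        "SELECT COALESCE(SUM(qty), 0) FROM sales WHERE sku = ?", (sku,)
    ).fetchone()
    return total


if __name__ == "__main__":
    warehouse = make_warehouse()
    # Hypothetical transactions already captured from the stores.
    captured = deque([
        (12, "sourdough", 2, "08:44:58"),
        (12, "sourdough", 1, "08:45:03"),
        (7, "clam-chowder", 4, "08:45:05"),
        (7, "sourdough", 3, "08:45:09"),
    ])
    while trickle_load(warehouse, captured):
        print("sourdough units so far:", units_sold(warehouse, "sourdough"))
```

The point of the design is the separation: the loader is the only component that touches captured production data, so reporting queries can run as often as the business likes without adding load to the OLTP system.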
GoldenGate is far from the only vendor taking this approach. HiT Software, for example, pulls data off a production server and replicates it into an XML repository, allowing customers to run any analyses they need. Ascential Software (acquired by IBM in 2005), which began life as Informix Software, also offers near-real-time implementations.
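Converting a transaction record to XML, the step both Haggen and HiT Software rely on, might look something like the following sketch. The element names and the sample record are assumptions made for illustration; neither vendor's actual schema is described in this article.

```python
import xml.etree.ElementTree as ET


def transaction_to_xml(txn):
    """Serialize one transaction (a plain dict) as an XML string.

    The element and attribute names are hypothetical, not a vendor schema.
    """
    root = ET.Element("transaction", id=str(txn["id"]))
    for field in ("store", "sku", "quantity", "timestamp"):
        ET.SubElement(root, field).text = str(txn[field])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = {
        "id": 1001,
        "store": 12,
        "sku": "sourdough",
        "quantity": 2,
        "timestamp": "08:45:03",
    }
    print(transaction_to_xml(sample))
```

Because the result is ordinary XML, any downstream application with an XML parser can read it without knowing anything about the point-of-sale system that produced it.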
Software in the whole category works the same way historical reporting and data mining do, according to reports from IDC business analytics analyst Dan Vesset.
A market survey released last month found not only that customers continue to buy more sophisticated analytics, but also that they’re investing in both technology and human skills to expand the range and sophistication of their analyses and shorten the time it takes to get answers.
Near-real-time data analysis using data warehouses and other out-of-band approaches is a popular way to address the problem, the report says. The configuration is relatively simple, but it’s extremely fast, very reliable and far cheaper than systems with comparable results, according to Curt Miller, CTO of SinglePoint, which manages TV-Web marketing campaigns for customers such as Black Entertainment Television, Bravo, NBC and Fox.
“In TV-land, in many cases, especially on a live show, producers will do things that drive traffic to their Web site or through cell phones during the show,” Miller says. “What we had was a whole system geared for post-delivery reconciliation all of a sudden getting demands for immediate results.”
Results that SinglePoint had collated nightly suddenly had to be assembled and distributed continually, without trashing the Cognos data manager and Sybase database that formed the core of SinglePoint’s system.
“We jiggered our system with triggers so it would run every two hours, but it wasn’t something that would answer a question in five minutes,” Miller says. “We were able to point a mirror at that data and take the data out to the archive server and run the results against the mirror, so we aren’t hurting our production systems.”
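The idea of pointing a mirror at production data and running reports against the copy can be illustrated with a small Python sketch. It makes loud assumptions: SQLite's built-in backup call, which copies the whole database, stands in for a replication product that would stream only the changes, and the `votes` table and function names are hypothetical rather than anything from SinglePoint's system.

```python
import sqlite3


def refresh_mirror(production, mirror):
    """Copy the current state of production into the mirror.

    A full copy is a stand-in here; a real replication product would
    ship only the changes, and would do so continually.
    """
    production.backup(mirror)


def tally_votes(conn):
    """Reporting query, meant to be run against the mirror only."""
    return dict(
        conn.execute(
            "SELECT choice, COUNT(*) FROM votes GROUP BY choice ORDER BY choice"
        )
    )


if __name__ == "__main__":
    production = sqlite3.connect(":memory:")
    production.execute("CREATE TABLE votes (phone TEXT, choice TEXT)")
    production.executemany(
        "INSERT INTO votes VALUES (?, ?)",
        [("555-0101", "A"), ("555-0102", "B"), ("555-0103", "A")],
    )
    production.commit()

    mirror = sqlite3.connect(":memory:")
    refresh_mirror(production, mirror)
    print(tally_votes(mirror))
```

Reports see only what the last refresh delivered, so the freshness of the answers depends on how often the mirror is updated, while the production database never runs a reporting query at all.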
Miller declined to discuss cost, but says the price of the GoldenGate software and integration services was comparable to that of any other kind of database work, not the highest-of-the-high costs that would be attached to traditional real-time data systems.
“Right now, when you respond to one of these shows, you vote and see a message go through your phone, you hear the beep and 15 seconds later it’s in the archive server,” Miller says. “Producers can see how they were doing not two hours ago; now they know how they were doing 10 seconds ago.”