Tools: Data Analysis - Baseline

What Data Mining Can and Can't Do
By Allan Alter




Peter Fader, Wharton's quantitative marketing wizard, has a message for CIOs: Stop collecting so much customer data, and stop misusing data mining.

Peter Fader, professor of marketing at the University of Pennsylvania's Wharton School, is the ultimate marketing quant: a world-class, award-winning expert on using behavioral data in sales forecasting and customer relationship management. He's perhaps best known for his July 2000 expert witness testimony before the U.S. District Court in San Francisco that Napster actually boosted music sales. (Napster was then the subject of an injunction for copyright infringement and other allegations brought against it by several major music companies.)

The energetic and engaging marketing professor has a pet peeve: He hates to see companies waste time and money collecting terabytes of customer data in attempts to draw conclusions and make predictions that the data simply can't support. Fader has come up with an alternative, which he is researching and teaching: Complement data mining with probability models, which, he says, can be surprisingly simple to create. The following is an edited version of his conversation with CIO Insight Executive Editor Allan Alter.

CIO INSIGHT: What are the strengths and weaknesses of data mining and business intelligence tools?

FADER: Data mining tools are very good for classification purposes, for trying to understand why one group of people is different from another. What makes some people good credit risks or bad credit risks? What makes people Republicans or Democrats? To do that kind of task, I can't think of anything better than data mining techniques, and I think it justifies some of the money that's spent on it.

Another question that's really important isn't which bucket people fall into, but when will things occur? How long will it be until this prospect becomes a customer? How long until this customer makes the next purchase? So many of the questions we ask have a longitudinal nature, and I think in that area data mining is quite weak. Data mining is good at saying, will it happen or not, but it's not particularly good at saying when things will happen.

Data mining can be good for certain time-sensitive things, like is this retailer the kind that would probably order a particular product during the Christmas season. But when you want to make specific forecasts about what particular customers are likely to do in the future, not just which brand they're likely to buy next, you need different sets of tools. There's a tremendous amount of intractable randomness to people's behavior that can't be captured simply by collecting 600 different explanatory variables about the customer, which is what data mining is all about.

People keep thinking that if we collect more data, if we just understand more about customers, we can resolve all the uncertainty. It will never, ever work that way. The reasons people, say, drop one cell phone provider and switch to another are pretty much random. It happens for reasons that can't be captured in a data warehouse. It could be an argument with a spouse, it could be that a kid hurt his ankle in a ballgame so he needs to do something, it could be that he saw something on TV. Rather than trying to expand data warehouses, in some sense my view is to wave the white flag and say let's not even bother trying.
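
To make the "when will it happen" question concrete, here is a minimal sketch of the kind of simple probability model the introduction alludes to. It is an illustration only, not Fader's own model: it assumes every customer has the same constant chance of leaving in each billing period (a geometric model), and the function names and sample data are hypothetical. No explanatory variables are used at all; the randomness is modeled directly.

```python
# Illustrative geometric churn model (an assumption for this sketch,
# not a method taken from the interview).
# Each customer is recorded as (periods_observed, churned):
#   churned=True  -> the customer left during period number `periods_observed`
#   churned=False -> the customer was still active after `periods_observed` periods


def fit_churn_probability(customers):
    """Maximum-likelihood estimate of the per-period churn probability.

    Under the geometric model the estimate is simply
    (number of churn events) / (total customer-periods at risk).
    """
    churn_events = sum(1 for _, churned in customers if churned)
    periods_at_risk = sum(periods for periods, _ in customers)
    if periods_at_risk == 0:
        raise ValueError("need at least one observed customer-period")
    return churn_events / periods_at_risk


def survival_probability(theta, periods_ahead):
    """Probability that a customer is still active after `periods_ahead` periods."""
    return (1.0 - theta) ** periods_ahead


def expected_lifetime(theta):
    """Expected number of periods until churn (mean of a geometric distribution)."""
    if theta <= 0:
        raise ValueError("churn probability must be positive")
    return 1.0 / theta


# Hypothetical sample: five customers tracked over a few billing periods.
customers = [(3, True), (5, False), (2, True), (6, False), (4, True)]

theta = fit_churn_probability(customers)
print(f"Estimated churn probability per period: {theta:.3f}")
print(f"Chance a customer is still active in 12 periods: "
      f"{survival_probability(theta, 12):.3f}")
print(f"Expected customer lifetime in periods: {expected_lifetime(theta):.1f}")
```

The point of the sketch is the shape of the answer: instead of sorting customers into "will churn" and "won't churn" buckets, the model returns a probability for every future period, which is the timing question Fader says classification tools handle poorly. A realistic model would also let the churn probability vary across customers rather than assuming one value for everyone.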



 
 