Text Mining Tools: Don't Let Data Confuse You

By David F. Carr  |  Posted 2005-08-04

Text mining tools can help a company avoid surprises—if they're deployed and used correctly.

Human analysts, not computers, bear most of the responsibility for spotting competitive threats, trends and opportunities. Technologists can arm them with "text mining" software that analyzes news stories, patent filings, customer-service notes and the like to find mentions of a company and its products and activities. The tools try to determine whether a company is seen in a negative or positive light, and to spot themes that might provide insight into what customers want the company to do next.

PROBLEM: The goal of discovering unexpected patterns through text mining can be elusive.

RESOLUTION: Take the time to properly define what you're looking for. Text mining aims to identify specific entities, such as people or customers, related facts and attributes, and events such as product launches or company acquisitions. But doing it effectively often requires a substantial investment in defining synonymous terms, such as scientific versus trade names for a given drug, or establishing relationships between products and companies or between parent companies and their subsidiaries or acquisitions (the startup that's suddenly relevant because it's been bought by your biggest competitor).
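To make that definitional work concrete, here is a minimal Python sketch of the two lookups described above: a synonym table that folds trade and generic drug names into one canonical term, and an ownership table that traces a subsidiary back to its parent. The table contents, the company names (StartupCo, BigRival Inc) and the function names are hypothetical, chosen only for illustration; commercial text mining products maintain far larger dictionaries and are not claimed to work this way internally.

```python
import re

# Hypothetical alias table: each name maps to one canonical entity.
ALIASES = {
    "acetaminophen": "acetaminophen",  # generic name
    "paracetamol": "acetaminophen",    # another generic name
    "tylenol": "acetaminophen",        # trade name
}

# Hypothetical ownership table: subsidiary -> parent (lowercase keys).
PARENT_OF = {
    "startupco": "bigrival inc",
}

# Hypothetical list of companies the analyst is tracking.
WATCHLIST = {"bigrival inc"}


def canonical_terms(text):
    """Return the set of canonical entities mentioned in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {ALIASES[w] for w in words if w in ALIASES}


def ultimate_parent(company):
    """Follow subsidiary -> parent links until no parent is listed."""
    seen = set()
    name = company.lower()
    while name in PARENT_OF and name not in seen:
        seen.add(name)  # guards against accidental cycles in the table
        name = PARENT_OF[name]
    return name


def is_relevant(company):
    """A company matters if its ultimate parent is on the watchlist."""
    return ultimate_parent(company) in WATCHLIST
```

With these tables, a story mentioning both Tylenol and paracetamol counts as one topic, and a little-known startup is flagged the moment the ownership table says a tracked competitor has bought it.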

The "machine learning" capabilities of text mining software can accelerate this process by guessing at relationships—for example, by using computational linguistics to identify the relationship between a company and an action it has taken, such as acquiring another firm.
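As a rough illustration of what identifying the relationship between a company and an action means, the sketch below pulls (acquirer, action, target) triples out of a sentence. It is a deliberately simple stand-in: a hand-written regular expression, not machine learning or real computational linguistics, and the pattern, function name and example companies are all assumptions made for this example.

```python
import re

# One or more capitalized words, treated here as a company name.
NAME = r"[A-Z][\w&]*(?: [A-Z][\w&]*)*"

# Hypothetical pattern: "<Acquirer> [has] acquired|bought|purchased <Target>".
ACQUISITION = re.compile(
    rf"(?P<acquirer>{NAME})"
    r" (?:has |have )?(?:acquired|bought|purchased) "
    rf"(?P<target>{NAME})"
)


def extract_acquisitions(text):
    """Return (acquirer, 'acquired', target) triples found in the text."""
    return [
        (m.group("acquirer"), "acquired", m.group("target"))
        for m in ACQUISITION.finditer(text)
    ]


triples = extract_acquisitions(
    "Last week Acme Corp acquired Widget Labs for an undisclosed sum."
)
```

A pattern like this breaks easily. A capitalized word placed directly before the company name is swallowed into it, and passive phrasing such as "was bought by" is missed entirely. That brittleness is the reason vendors turn to statistical and linguistic methods, and the reason analysts still need to review what the software guesses.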

PROBLEM: Some forms of text mining don't work equally well for every company. One example is "sentiment mining," which focuses on identifying positive or negative comments posted online about your company or its products.

RESOLUTION: Know the limits of the tools you choose to use. Sentiment mining vendors such as Intelliseek, with its BrandPulse suite and BlogPulse product, provide a way of continuously monitoring the online buzz about your company and its products. If you're Apple Computer, it's likely that a significant slice of your customer base is buzzing about the latest Mac or iPod product. But what if you're Mattel? Are there enough young girls keeping blogs for BlogPulse to discover meaningful trends in attitudes toward Barbie?
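The volume question can be built into the analysis itself. The following Python sketch scores posts against a tiny word list and declines to report a sentiment figure when too few posts mention the brand. The word lists, the 30-mention default and the function names are assumptions for illustration only; this is not a description of how BrandPulse or BlogPulse works.

```python
# Hypothetical sentiment word lists; real tools use much richer models.
POSITIVE = {"love", "great", "excellent", "good"}
NEGATIVE = {"hate", "terrible", "broken", "bad"}


def score_post(text):
    """Positive word count minus negative word count for one post."""
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def brand_sentiment(posts, brand, min_mentions=30):
    """Average score of posts mentioning the brand, or None if too few.

    The min_mentions default is an arbitrary placeholder, not a
    statistically derived threshold.
    """
    mentions = [p for p in posts if brand.lower() in p.lower()]
    if len(mentions) < min_mentions:
        return None  # not enough buzz to call it a trend
    return sum(score_post(p) for p in mentions) / len(mentions)
```

A result of None is itself useful information: it tells the analyst that the online conversation about a product is too thin to support conclusions, whatever the handful of available posts happen to say.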

PROBLEM: The analysis often produces too much information for users to easily digest.

RESOLUTION: Provide analysts with a toolkit that contains both visualization software, such as mapping packages, and spreadsheet-like displays. For example, MicroPatent's Aureka ThemeScape, an analysis tool for patent data, will display the intersections between, say, a new chemical and its applications as if they were elements in a topographic map. Mountain peaks represent clusters of similar applications; isolated applications appear as islands.

PROBLEM: Computer analysis does not equal insight.

RESOLUTION: Don't just read the data; study it and talk about it. Leaders have a tendency to do what worked for them in the past, ignoring changes that render that strategy obsolete, according to competitive intelligence expert Ben Gilad, author of the book Business Blindspots. "Blind spots are immune to any text mining or visualization tool on the market," he says. Gilad advises firms that want to avoid this pitfall to establish an early warning system in which executives and analysts meet twice a year to discuss what he terms "faint signals" of customer dissatisfaction and competitive threats.

David F. Carr is the Technology Editor for Baseline Magazine, a Ziff Davis publication focused on information technology and its management, with an emphasis on measurable, bottom-line results. He wrote two of Baseline's cover stories focused on the role of technology in disaster recovery, one focused on the response to the tsunami in Indonesia and another on the City of New Orleans after Hurricane Katrina. David has been the author or co-author of many Baseline Case Dissections on corporate technology successes and failures (such as the role of Kmart's inept supply chain implementation in its decline versus Wal-Mart or the successful use of technology to create new market opportunities for office furniture maker Herman Miller). He has also written about the FAA's halting attempts to modernize air traffic control, and in 2003 he traveled to Sierra Leone and Liberia to report on the role of technology in United Nations peacekeeping. David joined Baseline prior to the launch of the magazine in 2001 and helped define popular elements of the magazine such as Gotcha!, which offers cautionary tales about technology pitfalls and how to avoid them.
