The typical enterprise is awash in data. Virtually every industry sector—from health care and financial services to retail and manufacturing—is witnessing a growth in data volume at exponential rates.
Tucked away in these vast repositories is a gold mine of business secrets, opportunities and potential success. But putting all this information to work—including legacy data as well as unstructured data generated by social media and video—is a daunting task.
“In many cases, organizations have transactional data extending back 30 years or more, but they’re also coping with enormous volumes of multimedia data,” says Gary Curtis, chief technology strategist and managing director at Accenture. “Combining everything and making sense out of it is the challenge of the digital age. Currently, few organizations are tapping into the full potential of their data.”
Admittedly, this is a major challenge. A study of 5,000 organizations by research and advisory firm Corporate Executive Board (CEB) found that the ability to analyze and glean insights from data is a priority for global organizations—though few succeed in any significant way.
Only 38 percent of employees and 50 percent of senior managers have the ability to make good decisions based on data, according to CEB research. In many organizations, the greatest risk comes from too much analysis. More than 40 percent of employees trust analysis over judgment, while nearly 20 percent go with their gut.
Employees best equipped to make good decisions—“informed skeptics”—effectively balance judgment and analysis, according to the CEB. These individuals possess strong analytic skills and listen to others’ opinions about analysis—but they are willing to dissent.
The ability to develop a well-defined strategy and implement a viable solution to manage big data isn’t optional in today’s data-driven environment. “There’s no escaping it; it’s touching every industry sector,” says Kalyan Viswanathan, director for Global Consulting Practices Information Management with Tata Consultancy Services. “Big data is changing business and creating new risks and opportunities. Savvy organizations are looking for ways to put it to work effectively.”
Managing Data in a New Era
Since the dawn of computing, companies have looked for ways to manage and exploit ever-growing volumes of data. Big data—which spans greater volumes of data and more touch points—is at the center of this trend. Consulting firm McKinsey & Co. estimates that the typical large enterprise today holds somewhere in the neighborhood of 200 terabytes of stored data.
Companies must also cope with rapidly escalating volumes of unstructured data that doesn’t fit easily into a conventional database or data warehouse. Over the last few years, “there’s been an extreme broadening of the types of data that become part of corporate data resources,” Accenture’s Curtis says.
Big data can unlock value. It can make data more transparent and usable on a regular basis; provide insights through richer and broader data sets; create narrower segmentation so that companies can fashion more targeted marketing campaigns and sales techniques; and help connect the dots to discover new products and services that might otherwise fly under the corporate radar. Organizations that use big data effectively are likely to realize a significant competitive advantage and open up new business opportunities.
But tackling big data isn’t as simple as putting a single system in place and automatically reaping results. It’s necessary to combine the right technologies and tools, build the right workflows and policies, find talent that can tap into analytics and predictive-analytics software, and build products and services that meet the needs of the rapidly changing marketplace.
“It requires extensive investments in data warehousing, data integration, business intelligence, data visualization tools, business analytics and predictive modeling,” Tata Consultancy’s Viswanathan notes. “There’s also a need to apply algorithms that uncover patterns, connections
Amazon at the Forefront
One company at the forefront of big data is Internet retailer and services provider Amazon. Not surprisingly, the company deals with petabytes of data and has an enormous need to leverage it in order to gain insights into customer behavior, improve the quality and cost of operations, drive innovative product features and, ultimately, bolster the bottom line.
Amazon relies on a highly scalable environment, including cloud resources, to derive insights and answers within minutes or hours rather than days or weeks, says Peter Sirota, general manager of Amazon’s Elastic MapReduce (EMR) initiative.
“Amazon.com uses a wide array of data sources, ranging from unstructured and semi-structured log files coming from application servers to structured data coming from various database systems,” Sirota says. The environment also allows Amazon and its customers to “store and crunch all types of data, including images, videos, DNA sequences, and weather-sensor statistics, as well as data collected from third-party sources, such as Twitter, Facebook and Salesforce … to better manage its product database and analyze the metrics that drive stronger operational performance.”
The online giant relies on open-source Apache Hadoop for distributed processing that includes unstructured data.
Also, Relational Database Management Systems (RDBMS) allow Amazon to run reports and optimize queries on structured data for questions that are known ahead of time.
Hence, “Hadoop and RDBMS are complementary technologies,” Sirota notes. Additionally, by using Amazon’s Simple Storage Service (Amazon S3) to store petabytes of data—along with embedded data processing, analysis and mining tools—Amazon is redefining the big data space both for its internal operations and for outside firms that use its services.
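To make the division of labor concrete, here is a minimal sketch in plain Python, not Amazon's actual pipeline. The first half imitates the map and reduce steps that a Hadoop job would distribute across many machines, applied to loosely formatted log lines. The second half uses the standard-library sqlite3 module as a stand-in for an RDBMS answering a question known ahead of time. The log format, table name and sample rows are all invented for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical semi-structured log lines (invented for illustration).
LOG_LINES = [
    "2012-03-01T10:00:01 GET /product/42 200",
    "2012-03-01T10:00:02 GET /product/42 200",
    "2012-03-01T10:00:03 GET /cart 500",
    "malformed line",
]


def map_phase(line):
    """Emit (path, 1) for each parseable line; skip lines that do not fit."""
    parts = line.split()
    if len(parts) == 4:
        yield parts[2], 1


def reduce_phase(pairs):
    """Sum the counts per key, as a reducer would after the shuffle step."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def count_hits(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return reduce_phase(pairs)


def spend_per_customer(rows):
    """Structured side: a predefined report expressed as a SQL query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(count_hits(LOG_LINES))
    print(spend_per_customer([("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]))
```

The map and reduce functions tolerate messy input and scale by being run in parallel on separate chunks of data, while the SQL query depends on a fixed schema. That contrast is one way to read Sirota's remark that the two technologies complement each other.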
Sirota notes that the cloud has radically changed what’s possible with big data. “You can generate insights from your data more quickly and at a price point unmatched by traditional technologies,” he says.
“The cloud provides instant scalability and elasticity. It enhances your ability and capability to ask interesting questions about your data and get rapid, meaningful answers. … Big data, when analyzed holistically and regularly, has the potential to transform how you interact and react to your customers.”
The Winds of Change
A growing array of companies and government organizations are turning to big data to redefine their business models. Tata Consultancy’s Viswanathan says that advertisers are sifting through mountains of data to better understand buying behavior and what actually drives results. Retailers are combining and correlating customer behavior, psychographics and customer lifetime events to create more accurate profiles.
Financial services firms are connecting diverse data points to create new services and to sell existing services more effectively. And health care providers are using big data to improve outcomes and cost structures.
Among the companies sold on the concept is Vestas Wind Systems, a Randers, Denmark, operator of wind farms used to generate electricity for utilities. The company, with more than 44,000 turbines in 67 countries, uses huge data sets to better understand where to locate turbines for optimum performance.
The company analyzes 178 parameters, including cloud cover, humidity, solar radiation, satellite imagery, deforestation maps and barometric pressure, notes Lars Christian Christensen, vice president of plant siting and forecasting. What’s more, researchers must examine data parameters hour by hour over a 12-year span. “It’s a huge multidimensional cube of information,” he says.
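A rough back-of-the-envelope calculation shows why Christensen describes the data as a cube. The sketch below counts the hourly readings for a single candidate location, using the 178 parameters and 12-year span cited above. The average year length of 365.25 days is an assumption, and the article does not say how many locations or grid cells Vestas evaluates, so the total volume is not estimated here.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumed average year length, including leap days


def hours_in_span(years=12):
    """Number of hourly time steps in the analysis window."""
    return round(years * HOURS_PER_YEAR)


def values_per_site(parameters=178, years=12):
    """Total readings for one location: one value per parameter per hour."""
    return parameters * hours_in_span(years)


if __name__ == "__main__":
    print("hourly steps:", hours_in_span())
    print("values per site:", values_per_site())
```

Under these assumptions a single site accounts for roughly 18.7 million values, before that figure is multiplied across every location under consideration.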
Vestas turned to a big data analytics system from IBM to provide insights for a database that is expected to reach 20-plus petabytes within four years. In the past, analysts were forced to sift through mountains of data, a process that could take weeks and devour very significant resources.
Today, Vestas is running the IBM BigInsights software on 1,222 connected, workload-optimized System x iDataPlex servers that make up its Firestorm supercomputer. It is capable of 150 trillion calculations per second and can analyze data sets within an hour in order to determine the best locations for turbines.
“We are able to provide answers for our customers quickly to help them build a business case and revenue-generation plan more effectively,” Christensen says. “The system has reduced the complexity of the planning process immensely. We have transformed the way we handle data and the entire analysis process.”
Accenture’s Curtis points out that big data can present some unique challenges. For one thing, it’s necessary to determine how diverse data sets can be combined to produce new insights. This requires analysts and business experts who can think in innovative ways.
For another, an organization must harvest assorted forms of unstructured data, including video clips, audio files and social media feeds. “There must be a way to identify these files and understand what types of data they provide and how it’s possible to use them effectively,” Curtis notes. Although techniques exist, including the use of metadata, it’s an area that is still emerging and evolving.
In addition, an enterprise must sort out governance issues, particularly those centering on which business lines own and manage data and who should have access to transactions. Financial firms that operate different business lines—such as retail banking, commercial banking, wealth management, brokerage and other services—are particularly prone to challenges in this area.
In some cases, the lines can become blurred because data might reside on servers operated by a business partner or service provider. “It’s critical that the data is managed effectively and that there’s one golden copy,” Curtis says. “Governance issues must be sorted out.”
Incorporating Social Media
Companies are also looking to incorporate social media into their big data models. Matrix, a Milan, Italy, Internet and online business-services provider, helps companies—from auto manufacturers to restaurants—define their digital strategies and handle brand engagement and reputation management.
Using SAS Content Categorization and SAS Text Miner, the company is able to provide sophisticated monitoring and analysis capabilities revolving around Web crawling, social media conversations and other activities, notes Alessandro Petrella, sales director at Matrix.
The company continually collects data from more than 500 news feeds and online sources and then drops the information into a Netezza data warehouse appliance. Matrix then runs the data through the SAS software to cleanse and categorize it using a set of established business taxonomies. It tunes its algorithm and adds new data elements regularly.
The database now exceeds two terabytes, and its ability to monitor sentiment and attitudes about a company grows daily. “We are able to process data in a fast and effective way,” Petrella says.
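The cleanse-and-categorize step can be illustrated with a simple keyword approach. The sketch below is not how the SAS software works internally, which the article does not describe. It only shows the general idea of normalizing raw text and tagging it against a business taxonomy. The taxonomy, its categories and the sample sentence are hypothetical.

```python
import re

# Hypothetical taxonomy mapping a category to its keywords (illustration only).
TAXONOMY = {
    "pricing": {"price", "expensive", "cheap", "discount"},
    "service": {"support", "staff", "waiter", "delivery"},
}


def cleanse(text):
    """Strip markup remnants and URLs, collapse whitespace, and lowercase."""
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML-like tags
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    return re.sub(r"\s+", " ", text).strip().lower()


def categorize(text, taxonomy=TAXONOMY):
    """Return the sorted categories whose keywords appear in the cleansed text."""
    words = set(re.findall(r"[a-z']+", cleanse(text)))
    return sorted(cat for cat, keywords in taxonomy.items() if words & keywords)


if __name__ == "__main__":
    post = "<p>The staff were great but the PRICE is too high http://example.com/x</p>"
    print(categorize(post))
```

Tuning such a system, as Matrix is said to do regularly, would amount to adding keywords and categories to the taxonomy as new topics appear in the feeds.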
A recent survey of business executives conducted by consulting firm Ovum found that up to two-thirds of respondents cited improved operational and strategic decision-making processes and better customer service as the most important business benefits of tapping big data. No less significant: Interest in big data now extends beyond larger enterprises. The firm discovered that 38 percent of companies maintaining data warehouses in the 1 terabyte or higher range have revenues below $50 million.
Ovum expects that big data requirements will become even more pervasive over the next two years, as organizations continue to look for ways to better analyze customer segments, prevent churn, manage public transportation networks and handle myriad other tasks.
Accenture’s Curtis says that all organizations—and specifically IT departments—must understand the dynamics of big data and then develop a clear strategy for both implementing and managing this data over time. He recommends modeling data strategies after those used by companies such as Amazon, Google and Yahoo!, which operate some of the world’s largest data centers and “offer a glimpse of the future of computing.”
Curtis also suggests that technology and business leaders work together to understand big data in a holistic way. “There’s a mutual education process that must occur,” he points out. “Alignment is absolutely necessary to carry a project forward.”
Although big data is still in its early days, it’s a trend that’s here to stay, Tata Consultancy’s Viswanathan states. “Companies are accumulating more and more data, as consumers turn to smartphones, tablets and other digital devices,” he says.
“Organizations that tap into big data and use it effectively are positioned to create a clear competitive advantage. They’re able to understand issues and trends in a way that wouldn’t have been imaginable only a few years ago.”