Amazon at the Forefront
By Samuel Greengard | Posted 2012-01-17
Organizations are awash in data. However, tapping into this valuable business resource to achieve maximum advantage requires a clearly defined strategy, along with the right technology solutions.
One company at the forefront of big data is Internet retailer and services provider Amazon. Not surprisingly, the company deals with petabytes of data and has an enormous need to leverage it in order to gain insights into customer behavior, improve the quality and cost of operations, drive innovative product features and, ultimately, bolster the bottom line.
Amazon relies on a highly scalable environment, including cloud resources, to derive insights and answers within minutes or hours rather than days or weeks, says Peter Sirota, general manager of Amazon’s Elastic MapReduce (EMR) service.
“Amazon.com uses a wide array of data sources, ranging from unstructured and semi-structured log files coming from application servers to structured data coming from various database systems,” Sirota says. The environment also allows Amazon and its customers to “store and crunch all types of data, including images, videos, DNA sequences, and weather-sensor statistics, as well as data collected from third-party sources, such as Twitter, Facebook and Salesforce … to better manage its product database and analyze the metrics that drive stronger operational performance.”
The online giant relies on open-source Apache Hadoop for distributed processing of large data sets, including unstructured data.
Relational database management systems (RDBMS), meanwhile, allow Amazon to run reports and optimize queries on structured data when the questions are known ahead of time.
Hence, “Hadoop and RDBMS are complementary technologies,” Sirota notes. Additionally, by using Amazon’s Simple Storage Service (Amazon S3) to store petabytes of data—along with embedded data processing, analysis and mining tools—Amazon is redefining the big data space both for its internal operations and for outside firms that use its services.
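To make the division of labor concrete, here is a minimal sketch in Python. It is an illustration only, not Amazon's code: the log lines, the orders table and every function name are hypothetical. Plain Python stands in for Hadoop's map and reduce steps, and the standard-library sqlite3 module stands in for an RDBMS.

```python
import sqlite3
from collections import Counter
from itertools import chain

# Hypothetical semi-structured application-server log lines (made up for illustration).
LOG_LINES = [
    "2012-01-17 10:00:01 GET /product/42 status=200",
    "2012-01-17 10:00:02 GET /product/7 status=404",
    "2012-01-17 10:00:03 GET /product/42 status=200",
]


def map_line(line):
    """Map step: emit (path, 1) for each GET request line that parses."""
    parts = line.split()
    if len(parts) >= 4 and parts[2] == "GET":
        yield parts[3], 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def hits_per_path(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    return reduce_counts(chain.from_iterable(map_line(line) for line in lines))


def load_orders(rows):
    """Load structured (product_id, amount) rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (product_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def revenue_by_product(conn):
    """A question known ahead of time, expressed as a fixed SQL report."""
    cur = conn.execute(
        "SELECT product_id, SUM(amount) FROM orders "
        "GROUP BY product_id ORDER BY product_id"
    )
    return cur.fetchall()
```

Calling hits_per_path(LOG_LINES) counts requests per page straight from raw log text, while revenue_by_product(load_orders(rows)) answers a predefined question against structured rows. In a real deployment the map and reduce functions would run in parallel across a Hadoop cluster rather than in a single process.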
Sirota notes that the cloud has radically changed what’s possible with big data. “You can generate insights from your data more quickly and at a price point unmatched by traditional technologies,” he says.
“The cloud provides instant scalability and elasticity. It enhances your ability and capability to ask interesting questions about your data and get rapid, meaningful answers. … Big data, when analyzed holistically and regularly, has the potential to transform how you interact and react to your customers.”