It’s a Jungle in Here

 
 
By Doug Bartholomew  |  Posted 2008-02-21

Here’s a pop quiz:

Which of the following does Amazon want to sell you?

A. The book Eat, Pray, Love by Elizabeth Gilbert

B. A Nintendo Wii

C. An Apple Mac laptop

D. One terabyte of storage

E. The computing power of 10,000 servers

F. Thought processes of a million human brains

G. All of the above

If you chose “all of the above,” there’s a strong chance you’re one of the more than 330,000 customers—predominantly developers and startups—tapping Amazon.com’s vast computer network to take advantage of some of the world’s most inexpensive computing power and data storage.

With the launch of the Simple Storage Service (S3) in March 2006 and the Elastic Compute Cloud (EC2) in August 2006, the $14.8 billion e-commerce giant served notice on the IBMs, Microsofts, Googles and Sun Microsystems of the world: Amazon wants a piece of the action.

“Amazon is at heart a technology company—a big, massive Internet application,” says Adam Selipsky, vice president of product development and developer relations at Amazon Web Services (AWS).

“It’s really a new line of business for us, these Web services that are measured and metered very much like a water bill or a phone bill,” adds AWS lead evangelist Jeff Barr.

By stepping beyond its familiar sphere of selling books, toys, music, DVDs and computers online, Amazon is not just venturing far from its own turf into an entirely new market; it’s also challenging the established players in the emerging software as a service (SaaS), managed services and grid-computing sectors. It’s as if General Motors suddenly opened its factories worldwide to thousands of companies that want to make their products using GM’s equipment and facilities.

“This definitely is a disruptive technology,” says James Staton, an analyst at Forrester Research. “It’s totally different from the mainstream offerings today. It’s something that can be used without anyone’s authority, where you can pay with a credit card. For a high-tech startup, say, two guys in a garage, it means they can start and grow a $100 million business without ever having bought a server.”

Pay by the Drink

If ever there was a business that needed shaking up, it’s the cozy world of enterprise computing. Every year, companies large and small spend billions of dollars on IT infrastructure and expert staffs to build and maintain complex systems. Software licensing, hardware integration, power and cooling, and staff training and salaries add up to a massive sum for an infrastructure that may or may not be used to its full capacity.

Then came the idea that companies could pay for computer processing “by the drink”—that is, pay for each transaction, minute of server power or megabyte of data stored. This approach, which is generally known as utility computing but also includes SaaS, lets companies use and pay for computing resources as needed, in the same way they use telephone and electric services.

The goal is to maximize the efficient use of computing resources and minimize user costs. Ideally, users can dial up or dial down usage in real time depending on operational demands. However, for several years, utility computing remained more of a concept than a reality, according to Forrester’s Staton.

What makes Amazon’s online computing services different from typical utility computing is that the company already has the infrastructure in place to provide these services.

“What Amazon is doing is cloud computing—they already have an infrastructure, with utility billing as a byproduct of that resource,” Staton says. He defines cloud computing as “a set of highly scalable, Internet-accessible infrastructure resources capable of hosting end-customer applications.”

In Amazon’s case, there is no doubt about the infrastructure being in place and ready for action: The company invested more than a dozen years and $2 billion assembling its network of servers and online resources for its e-commerce business. The infrastructure was designed with sufficient capacity to handle the most prodigious holiday traffic spikes.

This past holiday season, Amazon’s e-commerce site handled customer orders for a whopping 5.4 million items on Dec. 10 alone—more than 60 items per second. Its fulfillment network shipped 3.9 million units in a single day. Nintendo’s Wii game systems flew out the doors at an incredible clip—17 per second—whenever the coveted consoles were in stock.

Amazon’s e-commerce success hasn’t gone unnoticed on Wall Street. The company’s shares last year soared 135 percent, fractionally better than tech darling Apple’s and nearly three times the stock-value growth of search juggernaut Google.

Both revenue and profit surged last year: Sales rose 39 percent (from $10.7 billion in 2006 to $14.8 billion in 2007), and net income jumped 150 percent (from $190 million in 2006 to $476 million in 2007). Although the Web services foray isn’t yet a major revenue contributor, analysts estimate Amazon’s computing-services revenue at $46 million to $92 million.

Unlike conventional software vendors that must build vast data centers to get into the SaaS business, Amazon is tapping its existing—and often underused—infrastructure. “Amazon has a lot of capacity that sits idle for a lot of the year,” AWS’ Selipsky says. It’s as if a thoroughbred racehorse, restricted to running in cheap claiming races for 50 weeks of the year, suddenly was spurred on to compete in the Kentucky Derby.

Amazon has long been hip to the idea of making a buck off its various e-commerce systems. Nearly one-third of Amazon’s revenue comes from third parties, such as merchants that use its various technologies. “This is a multibillion-dollar business for us,” Selipsky says.

Shaking Up the Establishment

Amazon’s push into Web services is also drawing the attention of traditional technology companies. The vendors that stand to lose business to utility computing are fighting back and, in some cases, fighting fire with fire.

Just two months after Amazon sent aloft its Elastic Compute Cloud, IBM filed a pair of lawsuits against the retailer. Big Blue, an early champion of grid computing, took aim at the online retailer’s core business in what could be considered a shot across Amazon’s services bow.

IBM charged that Amazon had infringed on five of its patents governing the presentation of online advertising, as well as the way data is stored in an interactive network. IBM also claimed Amazon had taken its idea for the “weighted user” to create its well-known feature that enables a site to recommend books to customers based on previous purchases by other customers with similar reading tastes.

Microsoft is also fighting back, but its primary target is, of course, Google. After months of rumors and speculation that it was interested in acquiring Yahoo, Microsoft pulled the trigger last month with a $44.6 billion offer to buy the number-two search and online advertising company (see “Battle of the Brands” on p.12). Yahoo would give Microsoft more than just search technology: It would also provide the massive server infrastructure to deliver the online services and applications Google currently offers. (At press time, Yahoo had rejected Microsoft’s bid as inadequate.)

Other entrants in the utility computing space include tech distribution giant Ingram Micro, which launched its Seismic program last year. It offers a massive infrastructure that leases space to IT resellers and service providers for the delivery of managed services.

Smaller India-based startup Zenith Infotech came to the United States with a similar model, providing inexpensive back-end resources that resellers can use for everything from monitoring and managing customer networks to delivering remote backup and storage services. And the concept of specialized enterprise-class SaaS continues to catch fire, as evidenced by Salesforce.com rival NetSuite’s $160 million IPO.

The competitor that most closely mirrors Amazon’s services is Sun. Under the slogan “Own the results, not the hardware,” Sun’s Network.com site offers pay-per-use computing resources provided by the company’s Sun Grid Compute Utility. The utility consists of a package of powerful computational applications that enable software vendors and developers to build, test and deploy on-demand applications over the Web.

Sun’s applications library contains computational mathematics, computer-aided design engineering tools, design automation, molecular simulations, weather prediction systems, and even financial services applications for pricing and risk management analytics.

For all the talk of large companies being “disruptive” when they enter noncore businesses, such forays, especially by acquisition, can be risky. Just look at what happened when eBay strayed from its knitting to purchase online telephony service Skype for $2.6 billion in 2005. In October 2007, eBay took a $1.4 billion writedown when Skype failed to deliver the heady returns needed to justify the price eBay had paid for it. Now, with CEO Meg Whitman gone, eBay is reportedly considering spinning off Skype.

Amazon’s foray into grid computing pits it in a race against Google, Microsoft and a host of other companies that are rumored, according to Forrester’s Staton, to be vying to build the next big platform for the Web. Amazon wants its services to be that fabric—a layer of basic services on top of which everyone else builds their Web sites.

So far, Amazon seems to have staked out a major chunk of Web computing turf. In last year’s fourth quarter alone, the bandwidth used by customers of EC2 and S3 was greater than the bandwidth used by all of Amazon’s global Web sites combined.

Windows Limitation

A significant barrier that could keep Amazon from grabbing too big a piece of the enterprise computing pie is its limited usefulness to Windows-based shops. Although Amazon’s Selipsky says the S3 data storage service is platform agnostic, the EC2 computing service is open only to Linux-based platforms. EC2’s default operating system is Red Hat Linux, which many users replace with their own preferred version of Linux.
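For the Linux crowd, at least, getting a server from EC2 is an API call rather than a purchase order. The sketch below shows roughly what that request looks like using the open-source boto library; the machine-image ID and key-pair name are placeholders, not real identifiers.

```python
# A rough sketch of requesting a Linux server from EC2 with the open-source
# boto library. The AMI ID and key-pair name below are placeholders.
import time
import boto

conn = boto.connect_ec2()                 # credentials come from the environment
image = conn.get_image('ami-00000000')    # placeholder ID for a Linux machine image
reservation = image.run(key_name='my-keypair', instance_type='m1.small')
instance = reservation.instances[0]

while instance.state != 'running':        # poll until the new server boots
    time.sleep(5)
    instance.update()

print(instance.dns_name)                  # address to SSH into, billed by the hour
```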

Microsoft currently doesn’t offer a license suited to this type of computing infrastructure, Forrester’s Staton points out. And even if it did, it’s no slam-dunk that Amazon would embrace Windows on EC2, because licensing costs could undermine the company’s strategy of selling computing power for nickels and dimes per hour.

“We do intend to support Windows on Amazon EC2,” Selipsky says, “but currently there is no licensing model that supports the hourly pay-as-you-go pricing that has sparked such interest in Amazon EC2.”

The lack of support for Windows-based applications isn’t a serious limitation for AWS, according to Selipsky. “We typically see that large organizations have a mix of technologies, and many can make full use of Amazon EC2 in these heterogeneous environments,” he says. “The rapid uptake we’ve seen for this service speaks to that fact.”

For small companies, the Windows issue may be even less of a deal-breaker. At New York-based startup Animoto, a Web-based service that puts customers’ still photos to their choice of music, getting the sheer computing chore done as quickly and cheaply as Amazon can do it is what’s most important (see “Developers, Entrepreneurs Tap Into Amazon’s Cloud,” p.31). Even so, the process is such a chip-burner that it typically takes up to 10 minutes, and sometimes longer, for the music and photos to be combined into the finished product.

“Our customers upload JPEG images and MP3 music files, and our algorithm renders the video and synchronizes the music to the images, so we replicate what real producers of TV and video would do,” says Brad Jefferson, a co-founder and CEO of Animoto. “We loved our idea … but we knew it would be storage and server intensive. With Amazon Web Services, we are able to keep our prices low.”

Animoto charges $3 to do a single video with 10 photos or $30 per year for unlimited videos. One well-known customer, Republican presidential candidate John McCain, has an Animoto video on his MySpace page.

From the get-go, Animoto found itself hampered by EC2’s inability to run Windows-based applications. “We had a Windows dependency as part of an infrastructure stack, so we had to get rid of that to use Amazon Web Services,” Jefferson says. “You can’t have Windows machines within EC2.”

Animoto uses not only EC2 and S3, but also Amazon’s Simple Queue Service, which acts as a sort of bus stop, holding Animoto’s jobs until an available EC2 server can render each one into a finished video that is finally stored on S3. “That entire process is done using Amazon Web Services,” Jefferson says. “In fact, our entire infrastructure is on AWS.”
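Animoto’s code isn’t public, but the pattern Jefferson describes—a worker pulling jobs off the queue, rendering them on EC2 and parking the results on S3—is easy to sketch. In the example below, the queue name, bucket name and render_video() helper are hypothetical stand-ins, not Animoto’s actual system.

```python
# A sketch of the queue-to-render-to-storage loop Jefferson describes.
# The queue name, bucket name and render_video() stub are hypothetical.
import time
import boto
from boto.s3.key import Key

def render_video(job_id):
    """Placeholder for Animoto's rendering step (not described in detail here)."""
    return '/tmp/%s.mp4' % job_id

sqs = boto.connect_sqs()
s3 = boto.connect_s3()
jobs = sqs.get_queue('render-jobs')           # assumed queue name
videos = s3.get_bucket('finished-videos')     # assumed bucket name

while True:
    message = jobs.read()                     # next job waiting at the "bus stop"
    if message is None:
        time.sleep(5)                         # nothing queued; check back shortly
        continue
    job_id = message.get_body()               # e.g. an ID pointing at photos and an MP3
    output_path = render_video(job_id)        # heavy lifting happens on an EC2 server
    key = Key(videos)
    key.key = '%s.mp4' % job_id
    key.set_contents_from_filename(output_path)   # finished video lands on S3
    jobs.delete_message(message)              # remove the completed job from the queue
```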

Instant Scalability

One of the biggest advantages of Amazon’s Web services is its scalability. “We make it easier for self-funding, self-scaling businesses to succeed, because there are no gigantic technology investments they’re liable for,” Amazon’s Barr says.

Start-up CEOs and other entrepreneurs like the idea of being able to launch a new product or service for which the real-world demand is largely unknown, and still have the confidence that their support systems will handle whatever gets thrown their way—whether that’s 100 customers a day or 10,000 per hour.

“You have to plan for success,” Animoto’s Jefferson says. “Do we make it so only a hundred users can log in per week? It’s silly to prevent growth. We wanted a way we could scale to the world on day one.”

Jefferson and his partners looked at a number of other options, such as buying a batch of servers. “But a lot of those servers would have been dormant after the initial launch,” he says, “and it would have meant a huge capital expenditure.”

Amazon even tailors its pricing to the size of the computing job: 10 cents per compute hour for a single-processor instance, 40 cents per hour for a four-processor instance and 80 cents per hour for an eight-processor machine. Similarly, data storage costs 15 cents per gigabyte per month, and there is no charge for moving data from EC2 to S3.

By contrast, the Sun Grid Compute Utility charges users $1 per CPU per hour. Sun aggregates each customer’s job usage and then rounds it up to the nearest whole hour. For instance, a job that uses 1,000 CPUs for one minute would be billed as 1,000 CPU minutes or 16.67 CPU hours, with the latter figure rounded up to 17 hours, for a total of $17.
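The gap between the two billing models is easiest to see with a little arithmetic. The snippet below simply encodes the rates quoted above; it is illustrative only, not an official rate calculator from either vendor.

```python
# Back-of-the-envelope math for the two pricing schemes described above:
# EC2 at 10 cents per processor per hour, Sun Grid at $1 per CPU-hour,
# rounded up to the nearest whole hour. Rates are as quoted in this story.
import math

def ec2_cost(processors, hours, rate=0.10):
    # Amazon scales its hourly charge with instance size (1, 4 or 8 processors).
    return processors * hours * rate

def sun_grid_cost(cpus, minutes, rate=1.00):
    # Sun aggregates CPU-minutes, then rounds up to the nearest whole CPU-hour.
    return math.ceil(cpus * minutes / 60.0) * rate

print(ec2_cost(8, 10))            # eight-processor instance for 10 hours: $8.00
print(sun_grid_cost(1000, 1))     # 1,000 CPUs for one minute: 17 CPU-hours, $17
```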

As of January, Amazon customers had stored 14 billion objects on S3. Amazon won’t say what its data storage capacity or computing capacity limits are, nor whether the company has had to purchase additional hardware to handle AWS growth.

For startups—and even for large companies looking to do specific computing projects—this incredibly minuscule price tag for mojo-size computing tasks is a big draw, because it means they can forgo investing in yet another server farm with all of the associated operating and maintenance costs. Microsoft is using the storage service to help speed software downloads, for example, and Linden Lab is using it to deal with the blizzard of software downloads for its popular virtual world Second Life.

For some large companies, such as SanDisk and The New York Times, Amazon made it possible to launch new products or additional services.

SanDisk, the $4 billion maker of flash drives, uses Amazon’s data storage to provide automatic backup for its new Cruzer Titanium Plus. SanDisk adapted BeInSync’s software for the Cruzer, allowing the company to promise customers that their data is backed up even if the device is lost or stolen. “Amazon Web Services made it possible for us to pursue this very innovative new idea,” says Mike Langberg, a SanDisk spokesman.

For The New York Times, Amazon’s low cost was the main selling point. “The cost structure is so minimal that we didn’t have to make the traditional budget requests to get our project done,” says Derek Gottfrid, the newspaper’s senior software architect.

The project was massive and complex. America’s “newspaper of record” wanted a way to archive 11 million articles published from 1851 to 1980 as PDF files and make them available on the Web.

“We wanted a system that would be scalable, could handle a lot of traffic and could generate PDFs,” Gottfrid says. “We also needed a place to store these files. We weren’t really sure it would work using EC2 and S3, but we thought it was worth a chance to test it and see.”

Gottfrid was able to do the whole job in a few days. Each article destined for PDF format consisted of a series of TIFF images that had to be assembled in a particular geometric arrangement, including photos, captions, headlines and columns of text.

One of the biggest challenges was managing the large number of computer instances simultaneously, because running computations on large data sets is difficult to set up and manage. For that, Gottfrid took advantage of Apache’s Hadoop, an open-source implementation of the MapReduce idea developed at Google.

Hadoop, which provides a framework for running large data processing applications on clusters of commodity hardware, enabled Gottfrid to use EC2 to test-generate a few thousand articles using only four EC2 instances.
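The Times hasn’t published the job itself, but the shape of such a task is simple to sketch. With Hadoop Streaming, the map step can be an ordinary script that reads one article ID per line and emits a status record; the fetch_tiffs() and compose_pdf() helpers below are hypothetical stand-ins for the real assembly logic.

```python
#!/usr/bin/env python
# A sketch of a Hadoop Streaming mapper in the spirit of the Times project:
# each input line names one article; the mapper assembles its TIFF scans into
# a PDF and emits a tab-separated status record. The helpers are placeholders.
import sys

def fetch_tiffs(article_id):
    """Placeholder: pull the article's scanned TIFF images from storage."""
    return []

def compose_pdf(article_id, tiffs):
    """Placeholder: lay out photos, captions, headlines and text columns."""
    return 's3://archive-pdfs/%s.pdf' % article_id    # assumed output location

for line in sys.stdin:
    article_id = line.strip()
    if not article_id:
        continue
    pdf_location = compose_pdf(article_id, fetch_tiffs(article_id))
    # Hadoop Streaming expects key<TAB>value pairs on standard output.
    print('%s\t%s' % (article_id, pdf_location))
```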

Upon successful completion of the test, Gottfrid calculated he could run through all 11 million articles in just under 24 hours by harnessing 100 EC2 instances. The project generated another 1.5 terabytes of data to store in S3. He even ran it a second time to fix an error in the PDFs.

“Honestly, I had a couple of moments of panic,” he wrote in his blog. “I was using some very new and not totally proven pieces of technology on a project that was very high profile and on an inflexible deadline. But clearly it worked out, since I am still blogging from open.nytimes.com.”

Don't Try This at Home

Gottfrid’s experience demonstrates a key aspect of cloud computing via Amazon Web Services: Don’t bet your entire business on it—at least not yet. “You have to be pretty knowledgeable to use the Amazon Elastic Compute Cloud,” says Forrester’s Staton. “You have to set up your application. And it doesn’t automatically scale or provide backup—you have to do those things.”

After an outage on EC2 last fall, says Forrester’s Staton, customers who hadn’t backed up their applications by copying them to S3 lost them. Staton likens this experience to what happens when an Amazon.com retail customer loses connectivity while shopping online.

“You’ve put stuff in your shopping cart, but you've lost connectivity,” he says. “Afterward, you have to put everything back in again. It’s the same thing with EC2 unless you back it up.”
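In other words, copying anything you care about off an instance is the customer’s job. A bare-bones version of that chore, assuming the boto library and placeholder bucket and file names, is just a periodic upload to S3:

```python
# A minimal do-it-yourself backup loop, since an EC2 instance won't preserve
# its own state. Assumes the boto library; the bucket name and archive path
# are placeholders.
import time
import boto
from boto.s3.key import Key

s3 = boto.connect_s3()
backups = s3.get_bucket('my-app-backups')     # assumed, pre-created bucket

while True:
    key = Key(backups)
    key.key = 'app-state-%d.tar.gz' % int(time.time())               # timestamped object name
    key.set_contents_from_filename('/var/backups/app-state.tar.gz')  # placeholder path
    time.sleep(3600)                          # repeat hourly
```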

Customers sure noticed last fall’s outage. “We were affected,” says Animoto’s Jefferson. “We had several instances that went down. Luckily for us, it wasn’t a complete outage, because we had multiple instances of the servers.”

The bottom line for Amazon’s Elastic Compute Cloud? “At this stage,” Forrester’s Staton says, “it’s still very much a do-it-yourself service.”

As a result, there are a lot of companies trying it out, but not many are betting their businesses on it. “We see a lot of enterprises dipping their toes in and trying it out,” Staton adds, “but not a lot running things that they count on.”

Although Amazon claims it essentially can scale to the stars, even its most stalwart customers aren’t so sure. Says Animoto’s Jefferson, “I think it will be very interesting to see if we could ever outgrow them.”