Grid Computing: Buying Computing Power, by the Hour

As the senior director of computational genetics at Applied Biosystems, Francisco De La Vega was familiar with the advantages of using clusters of inexpensive computers working together to solve tough scientific computing problems.

But there was still a limit to how many hundreds of computers he could afford to keep on hand. When a monster computing challenge came along in October, the biotech company had actually downsized its in-house computer farm and was looking for an alternative way of handling peak demands for computer power.

That was when the National Institutes of Health (NIH) announced the completion of the first phase of an international project to validate millions of genetic sequence variations.

Applied Biosystems is in the business of providing researchers and biotech companies with genetic assays, which test for the presence of a given sequence.

To stay abreast of the market, Applied Biosystems needed to rapidly translate the NIH data into updates to its product line. Time to market would largely be determined by how quickly it could run the genetic data through its proprietary analytic programs.

That’s when De La Vega turned to the Sun Grid, the Sun Microsystems service that offers access to computer power for $1 per CPU-hour (that is, the amount of computing that one processor can deliver in an hour). Applied Biosystems rented time on 1,000 computers, which were able to complete in a week a job that he had estimated would take three months to run on his existing servers.

“It basically saved us a couple of months’ worth of computing time,” De La Vega says. “The alternative would have been for us to expand our internal compute farm, and pay the electricity and cooling costs on that, even knowing that it would be idle probably 50% of the time.”

This is exactly the concept behind utility computing, the idea that you ought to be able to buy access to computational power as you need it, with capacity that flexibly grows and shrinks in pace with your needs. Grid computing turns out to be a complementary concept, since grids of computers that can be dynamically assigned to different tasks are one way of delivering utility functionality.

And although Sun actually packages some other rent-a-data-center offerings under the Sun Grid banner, Applied Biosystems was tapping into the more grid-like part of it, where Sun supplies access to lots of relatively inexpensive servers working in tandem. Those servers run Solaris 10, but on AMD Opteron processors rather than Sun’s proprietary CPUs.

Some definitions of what makes a grid a grid (rather than just a large computer cluster) emphasize the ability of computing jobs to cross organizational boundaries. De La Vega and his team had to package up their Perl and C++ data analysis code, along with the source data, upload it into Sun’s infrastructure, run it, and retrieve the results.

First, they spent a week doing test runs to make sure the software would work the same on Sun’s computers as it did on their own. That was something they couldn’t take for granted, given that Sun’s infrastructure featured 64-bit processors and the software had been written for 32-bit chips. They had to consider that some subtle difference (for example, in the handling of floating-point numbers) would throw off the mathematics of their calculations, De La Vega says. But when no significant glitches cropped up, they charged ahead.
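The article does not say how the team compared its test runs, but a check of this general kind can be sketched in Python. The function name, the tolerance, and the sample values below are illustrative assumptions, not the Applied Biosystems code; the idea is simply to compare numeric results from the in-house servers against results from the rented machines and flag any that differ by more than a small relative tolerance.

```python
import math


def compare_runs(local_results, remote_results, rel_tol=1e-9):
    """Return the indices where two runs disagree beyond rel_tol.

    local_results / remote_results: sequences of floats produced by the
    same analysis on two different platforms (hypothetical inputs).
    """
    if len(local_results) != len(remote_results):
        raise ValueError("runs produced a different number of results")

    mismatches = []
    for i, (a, b) in enumerate(zip(local_results, remote_results)):
        # math.isclose tolerates tiny rounding differences but not real drift
        if not math.isclose(a, b, rel_tol=rel_tol):
            mismatches.append(i)
    return mismatches


if __name__ == "__main__":
    # Made-up sample values: the first pair differs only by rounding noise,
    # the second pair differs meaningfully.
    local = [0.1 + 0.2, 2.0]
    remote = [0.3, 2.1]
    print(compare_runs(local, remote))
```

An empty list would mean the two platforms agree within the chosen tolerance; how tight that tolerance should be depends on the analysis and is a judgment call.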

In a couple of respects, Applied Biosystems didn’t quite fit the business and technology assumptions of the Sun Grid.

To make it easy for firms to buy grid computing power on an ad hoc basis, Sun offers it for purchase by credit card, through the same Web portal used to submit and control computing jobs.

But the Applied Biosystems finance people insisted on cutting a purchase order for the 36,000 CPU-hours De La Vega estimated would be consumed by analysis of the newly released genetic data from the National Institutes of Health.

It turned out he had over-estimated, leaving the company with a credit of 10,000 hours to use on a future project. That’s not a problem, he says, as he expects to quickly find a few other projects he can speed to completion using the Sun Grid.
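The arithmetic behind those figures is simple enough to sketch. The snippet below uses only numbers quoted in the article (36,000 CPU-hours purchased, a 10,000-hour credit, $1 per CPU-hour); the function name is hypothetical, and it assumes the purchase order was priced at the published rate with no discount, which the article does not confirm.

```python
def cpu_hour_summary(purchased_hours, unused_hours, rate_per_hour=1.00):
    """Summarize a prepaid CPU-hour purchase.

    rate_per_hour defaults to the $1 per CPU-hour quoted for the Sun Grid.
    """
    used_hours = purchased_hours - unused_hours
    return {
        "used_hours": used_hours,
        "prepaid_cost": purchased_hours * rate_per_hour,
        "cost_of_hours_used": used_hours * rate_per_hour,
        "credit_value": unused_hours * rate_per_hour,
    }


if __name__ == "__main__":
    summary = cpu_hour_summary(purchased_hours=36_000, unused_hours=10_000)
    for name, value in summary.items():
        print(f"{name}: {value:,.0f}")
```

Under those assumptions, the job itself consumed 26,000 CPU-hours, and the leftover credit is worth $10,000 of future computing.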

One other assumption that didn’t quite match was Sun’s idea that customers would simply download their data over the Internet when a job finishes.

But the “verbose” data output format of the Applied Biosystems software made that impractical, so De La Vega instead asked Sun to copy the compressed data onto a USB hard drive and ship it back to him.

Overall, it was a good experience. In terms of the technical challenges of parallel programming, it was nothing new for his staff, De La Vega says, “but now it’s easy to get access to these extra cycles at a very low cost, rather than paying for the overhead of having the computer power in house whether you use it or not.”