CGG Taps PC Clustering in Oil Hunt

By 2020, experts say, worldwide demand for oil will truly begin to outstrip supply and prices will skyrocket. We’re running out. Anxious fuel companies want to suck as much oil from existing wells as they can, and don’t want to waste time puncturing new ground that won’t produce.

To find good places to drill, ChevronTexaco, ExxonMobil and others need data; it often comes from third-party research firms such as Houston-based CGG Americas. In one common test, scientists with vibrating trucks and air guns spend several days shooting hundreds of thousands of sound waves into land and undersea terrain to find out what’s underneath. They record how the waves travel over several square miles—where they meet obstructions, where they dissipate, how they echo.

CGG Americas then feeds the resulting terabytes of data into its proprietary software to create images of the geophysical action. To help guide drillers, the pictures can be manipulated to show layers, cross-sections, cubes or other three-dimensional representations. Time lapses also can be added to make predictions about oil reserves.

Historically, supercomputers—costing millions of dollars each—did that complicated computing work. But since 2000, CGG Americas has used a giant cluster of much less-expensive Dell PCs running the Linux operating system.

A cluster is a group of small computers that, when linked by a network and managed centrally, can act as one very powerful computer. At CGG Americas, nearly 3,000 Dell PC servers—each with two CPUs, for a total of about 6,000 processors—are connected in a cluster that crunches the same amount of seismic data as the company’s old supercomputers. But the cluster does it more quickly and cheaply.

Even if a company must buy thousands of processors to attain the raw power of a supercomputer, making the switch often makes financial sense. First, the price per processor is far lower in a cluster. For example, an IBM Regatta, a supercomputer-class Unix machine with 64 processors that CGG Americas used in the late 1990s, cost the company $1.5 million to $2 million. Two years ago, by comparison, CGG Americas paid Dell $830,000 for an initial cluster of 256 machines, or 512 processors. That works out to roughly $31,250 per processor for the supercomputer versus $1,621 per processor for the cluster.

Second, well-built clusters are more reliable than supercomputers and therefore can get real work done faster. True, no standard formula exists for calculating how many PC processors it takes to replace a given supercomputer. The most important variable is whether your critical software applications can be reworked to perform across many distributed processors. In a cluster, a computing job is split up, with different pieces allocated to different processors. After each of these sub-jobs runs, the results are combined to yield the finished data output.
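That split-and-combine approach can be sketched in a few lines of code. The Python example below is purely illustrative and is not CGG Americas' proprietary software; the trace data, the process_traces function and the use of a local process pool are all assumptions standing in for work that a real cluster would spread across thousands of networked machines.

```python
# Illustrative sketch of cluster-style split/combine processing.
# process_traces() is a hypothetical stand-in for a seismic imaging step.
from multiprocessing import Pool

def process_traces(chunk):
    """Pretend sub-job: process one slice of seismic trace data."""
    # On a real cluster, each chunk would be shipped to a separate node.
    return [sample * 2 for sample in chunk]  # placeholder computation

def split(job, pieces):
    """Split one large job into roughly equal sub-jobs."""
    size = max(1, len(job) // pieces)
    return [job[i:i + size] for i in range(0, len(job), size)]

if __name__ == "__main__":
    job = list(range(100_000))           # stand-in for terabytes of seismic data
    sub_jobs = split(job, pieces=8)      # one piece per processor
    with Pool(processes=8) as pool:
        partial_results = pool.map(process_traces, sub_jobs)  # run pieces in parallel
    finished = [x for part in partial_results for x in part]  # combine the results
    print(len(finished), "samples processed")
```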

With its supercomputer, once CGG Americas launched a seismic computing job, the run had to proceed uninterrupted; any disruption, such as a processor failing, meant starting over. Jobs therefore took a few months to finish. The cluster, however, has built-in fail-over processors, so if one CPU crashes, others kick in to take over its workload.
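The fail-over idea can be sketched the same way: rather than restarting an entire multi-month run when one processor dies, only the failed piece is handed to a healthy processor. The retry logic below is a hypothetical illustration under that assumption, not the actual cluster-management software CGG Americas runs.

```python
# Hypothetical sketch of sub-job fail-over: a failed piece is retried
# on another worker instead of restarting the whole run.
import random

def run_on_worker(sub_job_id, worker_id):
    """Simulate running one sub-job on one worker; occasionally 'crash'."""
    if random.random() < 0.1:                      # simulated CPU failure
        raise RuntimeError(f"worker {worker_id} crashed")
    return f"result of sub-job {sub_job_id}"

def run_with_failover(sub_job_id, workers):
    """Try each available worker until the sub-job succeeds."""
    for worker_id in workers:
        try:
            return run_on_worker(sub_job_id, worker_id)
        except RuntimeError:
            continue                               # hand the piece to the next worker
    raise RuntimeError(f"sub-job {sub_job_id} failed on all workers")

if __name__ == "__main__":
    workers = list(range(4))
    results = [run_with_failover(job_id, workers) for job_id in range(20)]
    print(len(results), "sub-jobs completed despite simulated failures")
```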