Inside MySpace: The Story
By David F. Carr | Posted 2007-01-16
Booming traffic demands put a constant stress on the social network's computing infrastructure. Here's how it copes.
Fourth Milestone: 9 Million – 17 Million Accounts
When MySpace reached 9 million accounts, in early 2005, it began deploying new Web software written in Microsoft's C# programming language and running under ASP.NET. C# is the latest in a long line of derivatives of the C programming language, including C++ and Java, and was created to dovetail with the Microsoft .NET Framework, Microsoft's model architecture for software components and distributed computing. ASP.NET, which evolved from the earlier Active Server Pages technology for Web site scripting, is Microsoft's current Web site programming environment.
Almost immediately, MySpace saw that the ASP.NET programs ran much more efficiently, consuming a smaller share of the processor power on each server to perform the same tasks as a comparable ColdFusion program. According to CTO Whitcomb, 150 servers running the new code were able to do the same work that had previously required 246. Benedetto says another reason for the performance improvement may have been that in the process of changing software platforms and rewriting code in a new language, Web site programmers reexamined every function for ways it could be streamlined.
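As a quick check on what Whitcomb's figures imply, the arithmetic can be worked out directly. The two server counts come from the article; the variable names are illustrative, and the per-server comparison assumes the old and new servers were comparable hardware, which the article implies but does not spell out.

```python
servers_before = 246  # servers needed under ColdFusion, per the article
servers_after = 150   # servers doing the same work under ASP.NET, per the article

# Fraction of servers no longer needed for the same workload.
reduction = 1 - servers_after / servers_before

# How much more work each new server handles relative to an old one.
work_per_server_ratio = servers_before / servers_after

print(f"{reduction:.0%} fewer servers, {work_per_server_ratio:.2f}x work per server")
```

In other words, the same workload needed roughly 39 percent fewer servers, with each server doing about 1.64 times as much work.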
Eventually, MySpace began a wholesale migration to ASP.NET. The remaining ColdFusion code was adapted to run on ASP.NET rather than on a ColdFusion server, using BlueDragon.NET, a product from New Atlanta Communications of Alpharetta, Ga., that automatically recompiles ColdFusion code for the Microsoft environment.
When MySpace hit 10 million accounts, it began to see storage bottlenecks again. Implementing a SAN had solved some early performance problems, but now the Web site's demands were starting to periodically overwhelm the SAN's I/O capacity—the speed with which it could read and write data to and from disk storage.
Part of the problem was that the 1 million-accounts-per-database division of labor only smoothed out the workload when it was spread relatively evenly across all the databases on all the servers. That was usually the case, but not always. For example, the seventh 1 million-account database MySpace brought online wound up being filled in just seven days, largely because of the efforts of one Florida band that was particularly aggressive in urging fans to sign up.
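To make the imbalance concrete, here is a minimal sketch of a 1 million-accounts-per-database split. It assumes account IDs are handed out sequentially starting at 1, which the article does not state, and the function names are hypothetical rather than anything MySpace used.

```python
from collections import Counter

ACCOUNTS_PER_DATABASE = 1_000_000  # figure taken from the article


def database_for_account(account_id: int) -> int:
    """Return the 1-based database number for a 1-based account ID.

    Assumes sequential IDs: accounts 1..1,000,000 go to database 1,
    the next million to database 2, and so on.
    """
    if account_id < 1:
        raise ValueError("account IDs are assumed to start at 1")
    return (account_id - 1) // ACCOUNTS_PER_DATABASE + 1


def load_per_database(request_account_ids):
    """Count how many requests land on each database."""
    return Counter(database_for_account(a) for a in request_account_ids)
```

With this kind of split, a burst of activity from accounts created in the same period, such as one band's fans, lands entirely on one database. For example, `load_per_database([5, 1_000_001, 6_500_000, 6_500_001, 6_999_999])` puts three of the five requests on database 7.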
Whenever a particular database was hit with a disproportionate load, for whatever reason, the cluster of disk storage devices in the SAN dedicated to that database would be overloaded. "We would have disks that could handle significantly more I/O, only they were attached to the wrong database," Benedetto says.
At first, MySpace addressed this issue by continually redistributing data across the SAN to reduce these imbalances, but it was a manual process "that became a full-time job for about two people," Benedetto says.
The longer-term solution was to move to a virtualized storage architecture in which the entire SAN is treated as one big pool of storage capacity, without requiring that specific disks be dedicated to serving specific applications. MySpace has now standardized on equipment from a relatively new SAN vendor, 3PARdata of Fremont, Calif., that offered a different approach to SAN architecture.
In a 3PAR system, storage can still be logically partitioned into volumes of a given capacity, but rather than being assigned to a specific disk or disk cluster, volumes can be spread or "striped" across thousands of disks. This makes it possible to spread out the workload of reading and writing data more evenly. So, when a database needs to write a chunk of data, it will be recorded to whichever disks are available to do the work at that moment rather than being locked to a disk array that might be overloaded. And since multiple copies are recorded to different disks, data can also be retrieved without overloading any one component of the SAN.
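The difference between dedicated disks and a shared pool can be illustrated with a toy model. This is a simplified sketch, not a description of how 3PAR's products work internally: it treats every write as one equal-sized chunk, ignores the extra copies mentioned above, and uses hypothetical function names and numbers.

```python
import heapq


def dedicated_layout(writes, disks_per_volume):
    """Each volume owns its own disks; return the busiest disk's chunk count.

    `writes` maps a volume name to the number of chunks written to it.
    """
    busiest = 0
    for chunks in writes.values():
        # Chunks are split evenly over this volume's disks only (ceiling division).
        per_disk = -(-chunks // disks_per_volume)
        busiest = max(busiest, per_disk)
    return busiest


def pooled_layout(writes, total_disks):
    """All volumes share one pool; each chunk goes to the least-loaded disk."""
    heap = [(0, disk) for disk in range(total_disks)]  # (chunks written, disk id)
    heapq.heapify(heap)
    for chunks in writes.values():
        for _ in range(chunks):
            load, disk = heapq.heappop(heap)
            heapq.heappush(heap, (load + 1, disk))
    return max(load for load, _ in heap)
```

Given `writes = {"db1": 100, "db2": 100, "db7": 1000}` and 12 disks in total, the dedicated layout with four disks per volume leaves the busiest disk handling 250 chunks, while the pooled layout caps every disk at 100.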
To further lighten the burden on its storage systems, MySpace added a caching tier in the spring of 2005, when it reached 17 million accounts. This was a layer of servers placed between the Web servers and the database servers whose sole job was to capture copies of frequently accessed data objects in memory and serve them to the Web application without the need for a database lookup. In other words, instead of querying the database 100 times when displaying a particular profile page to 100 Web site visitors, the site could query the database once and fulfill each subsequent request for that page from the cached data. Whenever a page changes, the cached data is erased from memory and a new database lookup must be performed, but until then the database is spared that work and the Web site performs better.
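The pattern described here, look in memory first, fall back to the database on a miss, and erase the entry when the page changes, can be sketched in a few lines. The class and method names are hypothetical, and a real caching tier would run on separate servers rather than in a single in-process dictionary.

```python
class ProfileCache:
    """Toy read-through cache that counts how often the database is hit."""

    def __init__(self, load_from_database):
        self._load = load_from_database  # function: profile_id -> page data
        self._pages = {}                 # profile_id -> cached page data
        self.database_lookups = 0

    def get(self, profile_id):
        """Return the page, querying the database only on a cache miss."""
        if profile_id not in self._pages:
            self.database_lookups += 1
            self._pages[profile_id] = self._load(profile_id)
        return self._pages[profile_id]

    def invalidate(self, profile_id):
        """Erase the cached copy when the page changes."""
        self._pages.pop(profile_id, None)
```

Calling `get(42)` one hundred times on a fresh cache triggers a single database lookup. After `invalidate(42)`, the next `get(42)` triggers one more.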
The cache is also a better place to store transitory data that doesn't need to be recorded in a database, such as temporary files created to track a particular user's session on the Web site—a lesson that Benedetto admits he had to learn the hard way. "I'm a database and storage guy, so my answer tended to be, let's put everything in the database," he says, but putting inappropriate items such as session tracking data in the database only bogged down the Web site.
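A rough sketch of keeping session data in memory instead of the database follows. The expiry time is an assumption added for illustration, since the article does not say how MySpace aged out sessions, and the class name and injectable `clock` parameter are hypothetical.

```python
import time


class SessionStore:
    """In-memory session tracking; nothing here is written to a database."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._sessions = {}      # session_id -> (expiry time, data)

    def put(self, session_id, data):
        """Store or refresh a session, resetting its expiry time."""
        self._sessions[session_id] = (self._clock() + self._ttl, data)

    def get(self, session_id):
        """Return session data, or None if it is missing or has expired."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self._clock() >= expires_at:
            del self._sessions[session_id]
            return None
        return data
```

Because the data is transitory, losing it on expiry or a restart costs a user nothing more than a fresh session, which is why it does not need the durability of a database.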
The addition of the cache servers is "something we should have done from the beginning, but we were growing too fast and didn't have time to sit down and do it," Benedetto adds.