Projects: Networks/Storage - Baseline















Crash at 365 Main Street
By Doug Bartholomew

  Table of Contents:
  1. Crash at 365 Main Street
  2. What Caused the Outage?
  3. How To Guard Against


When the data center at one of the country's biggest co-location facilities experienced an unprecedented power outage, bringing nearly half its customers down with it, the entire industry learned some painful but powerful lessons.



At exactly 1:47 p.m. on July 24, Miles Kelly got the call every CIO and data center manager dreads: The data center had experienced a power outage. Indeed, a power surge had shut off energy to the company's primary data center in San Francisco, and four of the building's 10 backup generators had failed to start. Three computer rooms were down.

That would signal the start of a bad day for any enterprise, but for 365 Main, where Kelly serves as vice president of marketing and strategy, the problem was magnified many times over. That's because 365 Main isn't just any business: It's one of the nation's top data center managers, or co-location service providers. There are more than 75,000 servers in its 227,000-square-foot San Francisco facility, supporting hundreds of customers, including such high-profile companies as Craigslist, Sun Microsystems, Six Apart and the Oakland Raiders.

"When the failure of a data center becomes a bigger issue is when you have all these Web services and start-ups that have their data center services only at this one site," says James Staten, principal analyst in the infrastructure and operations group at Forrester Research.

When the four 2.1-megawatt diesel engine-generator units failed to kick in as they should have, it was a disaster in the making for 365 Main. The company promotes itself as having "The World's Finest Data Centers," and clients rely on it for constant uptime. Prior to the incident, 365 Main could claim 100% uptime.

But on the afternoon of July 24, 40% to 45% of 365 Main's customers lost power to their equipment for about 45 minutes, Kelly says.

At Sun Microsystems, sites were down 45 minutes to three hours, with most being restored in about 90 minutes, according to Will Snow, senior director of Web platforms at Sun. Although the power was out for 45 minutes, it can take a few hours to bring systems and networks back up and ensure they're working properly.

Snow says Sun had a backup plan for services at 365 Main, but it wasn't complete. "We're in the process of changing our disaster recovery plans to deal with shorter outages," Snow says. "Originally our plans were tailored for more significant outages of four-plus hours, but now we're looking to respond to very short outages such as the San Francisco outage."

At Six Apart, four of the company's Web sites—LiveJournal, Vox, TypePad and Movable Type—were down 90 minutes. On LiveJournal, the company posted an apology, explaining that during outages it would normally display a message telling visitors about the status of the site. "But because this was a full power outage there was a period of time where we could not access or update a status page," the posting explained.

Fortunately, 365 Main was able to manually restart the generators that failed to kick in automatically, which allowed it to operate on backup power until PG&E began delivering a stable power supply.
