Best Practices in Disaster Recovery, Business Continuity Planning

It’s every IT manager’s worst nightmare: the call from the CEO to evacuate the data center because of a hurricane or other emergency. That’s what happened to John Chaffe, IT director of New Orleans-based Tidewater Marine, the morning before Hurricane Katrina hit, and he ended up driving critical servers to shared office space in Houston.

Chaffe and other IT executives learned important lessons about planning for business continuity with appropriate storage and backup strategies. Disaster recovery (DR) should be considered not only in terms of business continuity and application availability, but also for compliance reasons.

“Three years ago, Hurricanes Frances and Wilma destroyed three of our 11 campuses,” says Dan Weiss, IT director at MedVance Institute of West Palm Beach, Fla., a chain of health professional training schools. “The IT director [at the time] had to operate our data center out of his house for about a week.” Since then, Weiss has put together a solid disaster recovery plan that includes a colocated data center in Atlanta, which is out of the path of most hurricanes.

Of course, IT operations can be interrupted for reasons other than natural disasters, so it’s important to be prepared for any type of disruption.

“Our offices are in Manhattan, and one of our building’s other tenants is Microsoft,” says Mark Tirschwell, CTO of Wall Street Systems, a provider of hosted systems for the financial services industry. “They can shut down the building for an entire day when they need more power for their servers.”

Tirschwell suggests conducting an extensive risk analysis and getting input from key stakeholders, such as operations people, department managers and application development managers. “Understand the key systems that will keep your business running and what will make a huge difference in your business operations,” he advises.

It’s also important to have a thorough understanding of the systems’ interdependencies and which systems need to be brought up first.

“Systems need to be restored in a specific order and with specific Internet and LAN connections,” says Mike Croy, director of business continuity solutions at Forsythe Solutions Group, a consulting firm in Skokie, Ill. “You need to know the business impact of losing connectivity and the compliance implications of losing critical customer records.”
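One way to make restore order explicit is to record each system's dependencies and derive the sequence from them. The following is a minimal sketch in Python, not a method attributed to anyone quoted here; the system names and the dependency map are hypothetical, chosen only for illustration. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (an assumption for illustration):
# each system maps to the set of systems that must be running before it.
DEPENDENCIES = {
    "network": set(),
    "active_directory": {"network"},
    "database": {"network"},
    "email": {"active_directory"},
    "document_management": {"active_directory", "database"},
}


def restore_order(dependencies):
    """Return the systems in an order where every dependency comes first.

    Raises graphlib.CycleError if the map contains a circular dependency.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = restore_order(DEPENDENCIES)
print(order)
```

A circular dependency in the map raises an error rather than producing a misleading order, which is itself a useful check on the catalog.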

Once you have this information, make a detailed catalog of all your servers and services and understand the recovery-time objectives for each. Some of these systems need to be up and running within minutes, while others can wait hours or even days.

“We don’t restore all our servers,” says Lee Abner, technology director for CIB Marine Information Services, a Bloomington, Ill.-based subsidiary of CIB Marine Bancshares, a bank holding company. “Within the first 48 hours, we need roughly 30 of our servers that are mission-critical, such as e-mail, document management and check-imaging systems, along with backups of our Active Directory environment and anti-virus protection.” The rest of the company’s systems can wait, he says.
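A catalog of this kind can be as simple as a list of systems with a recovery-time objective (RTO) attached to each. The sketch below assumes hypothetical system names and RTO values; only the 48-hour window comes from the example above. It shows how such a list could be filtered to the systems that must be back within a given window, most urgent first.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class System:
    name: str
    rto: timedelta  # recovery-time objective: how long the system may stay down


# Hypothetical catalog entries and RTO values (assumptions for illustration).
CATALOG = [
    System("reporting", timedelta(days=7)),
    System("document_management", timedelta(hours=48)),
    System("email", timedelta(hours=4)),
    System("check_imaging", timedelta(hours=24)),
]


def due_within(catalog, window):
    """Return systems whose RTO falls inside the window, most urgent first."""
    return sorted((s for s in catalog if s.rto <= window), key=lambda s: s.rto)


first_48_hours = [s.name for s in due_within(CATALOG, timedelta(hours=48))]
print(first_48_hours)
```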

Munder Capital’s disaster recovery priorities depend on the nature of the system. “We take snapshots ranging from every hour to every 15 minutes, depending on our systems,” says Wolfgang Goerlich, network operations and security manager for the Birmingham, Mich.-based investment banking firm. “Our top-tier systems, such as trading, can have an issue if we lose even 15 minutes. Lower-tier systems, such as research, just generate reports once a day, so if they lose data for [a few] hours, it isn’t as big of an issue. With our lowest-tier systems, our DR plan is to go out and buy boxes and bring them up in a couple of weeks.”
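The reasoning behind tiered snapshot schedules is simple arithmetic: with periodic snapshots, the worst-case data loss is roughly one full snapshot interval, so the interval should not exceed the amount of data loss a system can tolerate. The sketch below illustrates that check. The tier names, intervals and tolerances are hypothetical and are not Munder Capital's actual configuration.

```python
from datetime import timedelta

# Hypothetical tiers (assumptions for illustration):
# name -> (snapshot interval, tolerable data loss)
TIERS = {
    "trading": (timedelta(minutes=15), timedelta(minutes=15)),
    "research": (timedelta(hours=1), timedelta(hours=4)),
    "archive": (timedelta(hours=1), timedelta(minutes=30)),
}


def tiers_at_risk(tiers):
    """Return tiers whose snapshot interval exceeds their tolerable data loss.

    Worst-case loss is taken to be one full interval, i.e. a failure that
    occurs just before the next snapshot would have run.
    """
    return sorted(
        name
        for name, (interval, tolerance) in tiers.items()
        if interval > tolerance
    )


at_risk = tiers_at_risk(TIERS)
print(at_risk)
```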