Pop Culture
By Baselinemag | Posted 2004-09-08
Three years after its tragedy on September 11, Cantor Fitzgerald uses pop quizzes to make sure it's ready for any disaster it could face.
On September 11, 2001, the bond trading operations of Cantor Fitzgerald were destroyed in the terrorist attacks on the World Trade Center, and 658 people died. Nonetheless, the electronic markets supported by the company's eSpeed subsidiary were restored within 48 hours.
In the three years since, Cantor Fitzgerald has launched new businesses, entered new markets and turned its focus to equity trading and investment banking. But, because of its life-shattering experience, the company is constantly on the lookout for the next disruption. Now, it relies on a simple technique to make sure it's prepared for disaster: the "pop quiz."
Instead of monthly, quarterly or annual reviews, eSpeed tests its trading systems, technology infrastructure and support-continuity plans year-round at unexpected times. There is no formal pattern or schedule.
"The pop quiz cycle is ongoing," says Joe Noviello, chief product architect for eSpeed and former CIO. "You can't say 'we'll do this next Friday' and let people prepare for it. We also try to test with someone other than the group that has been directly involved with disaster recovery planning."
The pop quizzes work like this:
Chief information officer Jim Johnson and his technology staff pick an inconvenient time to wake up the staff in Cantor's London data center. This could be dinnertime or 3 a.m.
Then comes the mission: Rebuild a mortgage-backed securities trading system from scratch, say. Also, a time limit: One hour. And the challenge falls to staff that most likely does not include any of the disaster recovery planners.
Why? It's the only way to address whether the procedures those planners have set out are clear and precise.
Questions that get reviewed in each pop quiz include: Is the documentation right? Could the network administrator rebuild the system if the top executives weren't available? Did the team utilize its U.S.-based resources correctly? Will this plan work for other systems, such as customer service?
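The mechanics described above (an unannounced start time, a team that excludes the disaster recovery planners, and a hard time limit) can be sketched in a few lines of Python. This is only an illustration, not eSpeed's actual tooling: the `Drill` class, the `plan_drill` function, the 90-day window and the three-person team size are all hypothetical choices made for the example.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Drill:
    """One unannounced recovery test (hypothetical structure)."""
    start: datetime          # unannounced start time
    system: str              # system to rebuild from scratch
    team: list               # staff chosen to run the recovery
    time_limit: timedelta    # deadline for the rebuild

    def passed(self, finished: datetime) -> bool:
        """Return True if the rebuild finished within the time limit."""
        return finished - self.start <= self.time_limit


def plan_drill(staff, planners, systems, window_start, window_days=90,
               team_size=3, time_limit=timedelta(hours=1), rng=None):
    """Pick a random time, system and non-planner team for a drill."""
    rng = rng or random.Random()
    planner_set = set(planners)
    # Exclude the disaster recovery planners so the written procedures,
    # not the planners' memory, are what gets tested.
    eligible = [person for person in staff if person not in planner_set]
    if len(eligible) < team_size:
        raise ValueError("not enough non-planner staff for a drill team")
    # Any minute in the window is fair game, including 3 a.m.
    offset = timedelta(minutes=rng.randrange(window_days * 24 * 60))
    return Drill(
        start=window_start + offset,
        system=rng.choice(systems),
        team=rng.sample(eligible, team_size),
        time_limit=time_limit,
    )
```

A caller would pass in its own staff list and system names, then record the finish time and check `drill.passed(finished)`. Keeping the schedule random, rather than fixed to a calendar, is what stops people from preparing for "next Friday."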
Other quizzes may not be as extreme as the London example. Traders and managers working at Cantor and eSpeed's midtown Manhattan office may get a phone call before work telling them to move to a backup center in New Rochelle and operate there, instead.
Similarly, workers in customer service, accounting or trade clearing may come in and find a message telling them they must report to work in New Rochelle. The test is whether they can do so without a blip.
At New Rochelle, they will find 50 seats with workstations capable of toggling between job functions, from executive level to back-office transaction tracking. At any time, on any day, any Cantor or eSpeed employee should be able to get their job done in this facility. Indeed, what started out as a backup site now has become Cantor's primary U.S. data center.
"In my experience, this type of testing is very uncommon," says Scott Lundstrom, an analyst at AMR Research. "Even companies that test monthly and quarterly often don't do true testing with full restarts from backup machines and no notice."
According to David Palermo, vice president of marketing for data recovery firm SunGard, most of his company's clients test business continuity plans annually.
Given Cantor Fitzgerald and eSpeed's history, the attention to business continuity is not surprising. Following the 9/11 attacks, Cantor rebuilt its employee base to 1,100, excluding a yet-to-be-completed spin-off, on par with pre-9/11 levels. ESpeed, which automated many processes to operate with fewer people, has a staff of 365, down from 486 before the attacks.
Cantor's advanced disaster recovery planning saved the company in the wake of the destruction. Cantor had a plan, dual architecture that replicated all machines at both New Rochelle and the World Trade Center, and a well-drilled staff that could execute under duress.
Cantor and eSpeed's frequent quizzes have another benefit: they allow the companies to test the integration of systems behind new businesses and services as they are being launched. Lundstrom says a big problem with many continuity plans is they stay static, even when the business changes through mergers, new services or new processes.
Indeed, Cantor and eSpeed are adding to their business continuity plans to account for new systems to support the spin-off of Cantor's telephone brokerage business and eSpeed's entry into markets involving mortgage-backed securities, equities and foreign currencies.
Plans will also have to be tweaked to account for a new data center. Cantor and eSpeed are moving in the first quarter of 2005 to a 125,000-square-foot headquarters in midtown Manhattan.
A data center there will replicate the capabilities found in New Rochelle. Plans for which site will be considered primary and which secondary haven't been laid out, according to Noviello.
But a single level of redundancy is no longer enough. Come mid-2005, eSpeed will operate two data centers in the U.S. and two in London. And Cantor is considering adding a third in the U.S.
"One lesson we've learned is that having many data center locations is fine," Noviello says. "Having just two is not."
How to Conduct Pop Quizzes
Staff should be surprised, as in a real crisis.
Who least expects it?
Avoid Using In-House Experts
Have non-technical managers lead recovery to see if documentation is clear.
Assume the Worst
Figure that systems have to be rebuilt from scratch.
Wait Until After Dark
Disasters often happen off-hours. Tests should, too.
Set a Time Limit
Every minute lost can't be regained.