For Joseph Noviello, September 11 began at 6:30 a.m. with a phone call confirming that an annual fishing trip with colleagues at the Cantor Fitzgerald bond trading firm was still on, despite some foul weather offshore.
At 8:30, though, the charter-boat captain called to cancel. Noviello, then the chief technology officer of Cantor's eSpeed electronic trading subsidiary, considered heading downtown to his office on the 103rd floor of the north tower of the World Trade Center. Instead, he decided to take the day off.
Minutes later, the most intense two days of his life would begin. The first plane hijacked by terrorists would hit Cantor's building. Watching on TV from his Manhattan apartment, Noviello had no way of knowing what ultimately lay in store. But clearly, a disaster of a proportion he had never had to deal with was unfolding.
Fortunately, he had a plan to follow.
That plan may have saved the companies. No firms suffered worse fates on Sept. 11 than Cantor Fitzgerald and its electronic marketplace unit, eSpeed. More than 700 employees of the two companies died in the destruction of the World Trade Center's north tower, where Cantor and eSpeed shared their headquarters and a vital computer center.
Yet eSpeed was up and running when the bond market reopened at 8 a.m. on Sept. 13, little more than 47 hours after the disaster. That was possible in part because of some lucky timing. But the rapid response was really enabled by careful planning, help from other companies and an indomitable team determined to show that even if their colleagues might be taken away, their spirits couldn't be. Their riposte to terror would be virtually uninterrupted service to the financial industry.
"The difference for us was the planning we had in place," says Noviello, 36, who was promoted to eSpeed's CIO after the disaster. He is sitting in the cramped, windowless office he now shares with another executive at the Cantor/eSpeed data center in Rochelle Park, N.J. "We understood that we needed to be able to respond to some sort of outage, that there is no acceptable downtime."
To that end, eSpeed's systems were built on a dual architecture that replicated all machines, connections and functionality at the World Trade Center and at a Rochelle Park site, with a third facility in London. "These are decisions made on logic, not technology," Noviello says upon reflection. "You don't buy one of anything, you buy at least two."
By 8:50 a.m. on that Tuesday, Noviello had heard from Jim Coffey, the man responsible for the Rochelle Park mirror site. Coffey was on his way to Jersey.
"It was instinct," says Noviello. "We always talked about the event that would cause [us] to run things out of here."
Unable to reach his friends or colleagues in the building by phone or pager, he called eSpeed's London office to get a read on what systems were still functional at the trade center.
"That would give me some idea of the status on a particular floor," he says. He watched in horror as the building collapsed. The line to London, which had been routed through Cantor's New York office, went silent.
ESpeed, which operates as a freestanding business and also serves as the trading engine for its parent company, would lose 180 employees, including about half of its U.S.-based technology staff. Among them were eSpeed President and Chief Operating Officer Frederick T. Varacchi, a former CIO of Cantor and the driving force behind eSpeed, and Joseph Giaccone, architect of eSpeed's disaster-recovery plan.
But eSpeed had some important assets left. Most of the top technology executives had been out of the office, including Matt Claus, eSpeed's current CTO and Noviello's right-hand man, who had been scheduled to go on the fishing trip.
"People have stories about why they made it," says Noviello. One senior network engineer was late to work because his wife burned the cake she was baking for their daughter's school and he had to run to the grocery store for ingredients. Some workers were on vacation; others were simply not due in on the 24-hour, three-shift schedule. So about 306 survivors, including more than 260 IT employees, were ready to work.
By mid-morning Tuesday, some staff members had been dispatched to Rochelle Park to conduct an inventory of which systems were running and which were not, the status of communications with customers and the actual transactions from the moment of impact that needed to be settled.
The atmosphere was tense, with people not knowing what had happened to their friends or colleagues. "For days, every time a new face came in the door it was an emotional release," says Noviello. "There was a disaster-recovery contact list, but people were seeking to find each other not for work but to find out who was OK."
Noviello would divide most of the next two days between a Manhattan command center set up in borrowed space and his home, using his cell phone, his fiancée's cell phone and a land line. Even so, he frequently patched into the company's London communications system to gain access to numbers he couldn't reach from New York.
Beyond the technical questions were operational details like advising staff on public transportation options to the suburban site, reestablishing shifts and making sure there were counselors on duty. Conference calls every two hours kept track of milestones and objectives. "We were talking at 2 a.m., at 4 a.m.," says Noviello. "Who is sleeping during something like this? Work is great therapy."
On Wednesday, Cantor Chairman and CEO Howard Lutnick told him the bond market would reopen the next morning, and asked if the system would be ready; Noviello spoke to Claus and his lieutenants and came back with a yes.
"We never considered not being there on Thursday," he says. "There is too much dedication and enthusiasm in this group. We said we will be there for ourselves and our friends."
Working in the cold, crowded data center, which normally houses no more than a handful of workers, eSpeed's people relied on their knowledge of the systems and procedures instead of following a written plan.
"We had some major hurdles to cross, but we approached them systematically, and each step worked so there was no need for a plan B," says Noviello. "For days it looked like chaos, but people knew what they needed to do." Senior people worked in shifts, and workers took charge and stepped up as needed. It helped that, at a small company, people had had lots of exposure to different systems.
None of this effort would have succeeded without the duplicate architecture in Rochelle Park. Yet Cantor only started moving into the facility in February. The previous disaster-recovery plan was based on a co-location plan at another New Jersey location, but Giaccone, who ran eSpeed's global infrastructure, had pressed for more.
"Joe pushed every day for another facility," says Noviello. "Our ability to work now is a testament not just to the work of these people but to the planning by our former colleague."
From day one, Rochelle Park was seen as a concurrent system, not a disaster-recovery site. The shift was driven by eSpeed's role as the largest player in electronic bond-trading, which meant uninterrupted service was an imperative. The nondescript building in a blue-collar town was perfect—a former telecom facility across from another telecom building. Systems alternated between the trade center and the mirror site, with particular products (e.g., zero coupon bonds) running live for a month at one location and then switching to the other; about half of the company's approximately 40 products were live at each location at any given time. "In that sense we had run our disaster-recovery tests the day before," says Noviello.
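The alternation described above can be pictured with a minimal sketch. It assumes, purely for illustration, that products are numbered and that each one swaps sites every calendar month by simple parity; the article does not say how eSpeed actually scheduled the switches, and the names `SITES` and `live_site` are hypothetical.

```python
from collections import Counter

SITES = ("World Trade Center", "Rochelle Park")


def live_site(product_index: int, month: int) -> str:
    """Return the site where a product runs live in a given month.

    Assumption: neighbouring products start at opposite sites and every
    product swaps sites each month, so the split stays roughly even.
    """
    return SITES[(product_index + month) % 2]


# Hypothetical catalogue of 40 products, identified only by index.
products = range(40)

# Count how many products are live at each site in two consecutive months.
september = Counter(live_site(p, month=9) for p in products)
october = Counter(live_site(p, month=10) for p in products)
```

Under this scheme each site carries half of the products in any month, and every product has run live at both sites within any two-month window, which is the sense in which normal operation doubles as a standing failover test.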
The mirror site and the World Trade Center were connected by a high-speed optical line, over which eSpeed linked the storage area networks at each site. Sybase data-replication software mirrored critical databases between the sites. Half of the company's Microsoft Exchange e-mail servers were also located full-time in Rochelle Park.
All that redundancy would be stretched to the limit as eSpeed worked to overcome the technical hurdles standing between it and the opening of the bond market Thursday morning. Two of those hurdles were huge: the loss of eSpeed's private network connections and the destruction of the company's ability to handle fulfillment of trades.
The critical transaction-processing systems that powered eSpeed's and Cantor's brokerage services had remained intact through the mirror in Rochelle Park. But though the matching engine stayed up and running, all connectivity to the private network had been through the optical connection to the World Trade Center. No one could reach eSpeed's servers from the private network to execute trades.
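The gap described above is a single point of failure in an otherwise duplicated design, and a small reachability check makes it concrete. The topology below is a simplified assumption drawn from the article's description, not eSpeed's actual network map; the node names, `LINKS` and `reachable` are all hypothetical.

```python
from collections import deque

# Assumed, simplified topology based on the article's description.
LINKS = [
    ("private network", "World Trade Center"),
    ("World Trade Center", "Rochelle Park"),  # high-speed optical line
    ("World Trade Center", "London"),         # line routed via New York
    ("Rochelle Park", "London"),
    ("Rochelle Park", "Internet"),
]


def reachable(start, goal, links, failed=frozenset()):
    """Breadth-first search over the links, skipping any failed site."""
    if start in failed or goal in failed:
        return False
    neighbours = {}
    for a, b in links:
        if a in failed or b in failed:
            continue  # a link is unusable if either end is down
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


down = frozenset({"World Trade Center"})
before = reachable("private network", "Rochelle Park", LINKS)
after = reachable("private network", "Rochelle Park", LINKS, failed=down)
via_internet = reachable("Internet", "Rochelle Park", LINKS, failed=down)
```

In this model the private network reaches the Rochelle Park servers only while the World Trade Center node is up, whereas the Internet and London paths survive its loss.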
On Sept. 11, both of eSpeed's redundant private networks serving the company's biggest brokerage customers were still located at the World Trade Center. Plans to connect its network to the public telephone network at Rochelle Park had not yet been finalized. "We've only had the building since February," says Noviello. "It wasn't fully built out, and there were some areas where we wanted to put in finishing touches."
But the Rochelle Park site could connect to the Internet and the London data center. Immediately, Noviello's team started contacting customers to arrange alternative ways for them to connect to eSpeed.
Customers with overseas offices connected to the London data center were rerouted across their own networks to London. ESpeed worked with customers to reconfigure its servers on their premises to point to London, and moved or expanded customers' access to the eSpeed network to connect to that site. For customers without overseas private networks, eSpeed worked to get them access over the Internet until the customers could remap some of their digital connections to the Rochelle Park facility.
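The rerouting rules in the paragraph above amount to a short decision procedure, sketched here under stated assumptions. The `Customer` fields, the `choose_route` function and the order in which the rules are checked are illustrative guesses rather than a record of how eSpeed's team actually triaged customers.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str  # hypothetical label, not a real client
    has_overseas_link_to_london: bool
    remapped_to_rochelle_park: bool = False


def choose_route(customer: Customer) -> str:
    """Pick a connection path using the rules described in the article.

    Assumption: a circuit already remapped to Rochelle Park is preferred,
    then an existing overseas link to London, then the Internet.
    """
    if customer.remapped_to_rochelle_park:
        return "digital circuit to Rochelle Park"
    if customer.has_overseas_link_to_london:
        return "customer network to London data center"
    return "Internet access until circuits are remapped"


# Three hypothetical customers, one for each branch of the rules.
global_bank = Customer("global bank", has_overseas_link_to_london=True)
domestic_dealer = Customer("domestic dealer", has_overseas_link_to_london=False)
remapped_dealer = Customer("remapped dealer", False, remapped_to_rochelle_park=True)

routes = [choose_route(c) for c in (global_bank, domestic_dealer, remapped_dealer)]
```

The fallback order matters less than the fact that every customer lands on some working path, which is what let the team contact clients immediately rather than wait for new circuits.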
The second problem, of fulfilling trades, was more difficult to resolve. Although Noviello's team was able to restore some of the applications that handle the financial back-office functions of the trading system, it was unable to reproduce the settlement system for all of eSpeed's products at the Rochelle Park site. Without being able to clear or settle transactions, eSpeed would be unable to open for business.
Here, help arrived in the form of one of eSpeed's competitors. ICI/ADP, another electronic trading company, offered to take care of eSpeed's clearing and settling of transactions through its own connection to banks. By Wednesday night, the eSpeed team had mapped its financial back-office system to ADP's system, and had successfully sent test transactions to J.P. Morgan Chase & Co. and other banks.
The cooperation of other companies, including vendors and fellow financial firms, turned out to be key to Cantor/eSpeed's quick recovery. ADP got the back-office component hooked up overnight, Compaq delivered 100 desktop computers at 2 a.m. on Wednesday and Verizon expedited the installation of voice lines and the transfer of some of eSpeed's digital circuits. Cisco provided a phone system based on Internet protocols, in a hurry. Microsoft had an NT team continuously on hand, as eSpeed's server and desktop maintenance group had been especially hard-hit. Meanwhile, UBS PaineWebber provided temporary office space in Manhattan. When units for running tape backups were in short supply, Claus was able to borrow one from a friend at another company by driving to his office at 11 p.m.
Trading in eSpeed stock resumed the first week of October. Shares, which had closed at $8.69 on Sept. 10, closed at $5.91 on Oct. 5. The firm was weakened by the loss of so many people and the related shutdown of its voice-broker business.
But it survived as a viable business. Thanks to planning, the company can keep operating, even if something should happen to Rochelle Park. Its data center in London will serve as the mirror site going forward.
And going forward, the company's systems should be even more resilient. "We are learning a lot of lessons as we are restoring the system," says Noviello, including how to automate more aspects of bringing systems back up. "And we are not restoring our bad habits."
8:30 am Decide to take day off after fishing trip canceled.
8:47 am The north tower of the World Trade Center is hit by an airplane. Data center could be affected.
8:50 am Receive call from Jim Coffey, manager of enterprise systems group. He's on his way to Rochelle Park, N.J., mirror site.
8:51 am Call London office over tie-line connection on 103rd floor of the trade center. Communications to data center still live.
9:03 am The south tower is hit by another airplane.
9:59 am The south tower collapses.
10:29 am North tower collapses; connection to London and data center is lost.
11:00 am An emergency team of IT staff is dispatched to Rochelle Park to assess the situation. A virtual private network connection to Rochelle Park data center established. Cable-modem service, cell phones, land lines used to direct restoration operations.
12:00 pm First milestones for bringing eSpeed back on line are set. Search for missing technology staff members continues.