Favorite Networks-From-Hell Stories
By David Strom | Posted 2008-10-30
Bad adapters. Clueless cleaning crews. Fires. Bugs. These are some problems facing network managers.
Over the years, I’ve witnessed some very strange network operations. Maybe it just comes with the territory, or maybe it’s just knowing so many hard-working IT managers who have some great stories to tell.
One of the early tales was of a Novell file server that went offline frequently in Norfolk, Va. I flew there with a couple of experts from Network General, back when they owned the Sniffer. We instrumented all sorts of things and captured traces galore.
We even went so far as to call the nearby Navy base commander to find out if they were using radar on some of their ships in the harbor. (Try doing that now, with the post-9/11 mindset.) It took hours of hard work to track down what was causing all the havoc: a bad network adapter in someone’s machine.
Here’s another true story: One person who periodically worked late was disconnected from his network at about the same time every night. He thought it was odd, because it happened when he was one of the few people in the office, so the load on the network was minimal.
It took several evenings before he tracked down the problem: The cleaning crew was vacuuming the rug in the same room as the server, and the network cable wasn’t properly crimped. As the vacuum ran over the carpet, the cable (which, for reasons unknown, ran underneath the carpet) would be pulled slightly, disconnecting the server. When the cleaning crew finished, the connection would be fine again.
I remember the time I had a fire in my office building, and my backup tapes were right next to the server. Luckily, my office wasn’t in danger. Now I do off-site backups.
At least I had verified that my backups were actually taking place, which reminds me of a firm that had backup tapes that were essentially devoid of data. Someone hadn’t set up things properly, so nothing was being backed up. No one noticed that the backups took almost no time to complete, until the fateful day when someone needed to retrieve a file and saw that nothing had been copied for several months.
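A check along these lines might have caught the empty tapes months earlier. What follows is only a minimal sketch: the `BackupRun` record, the `suspicious_runs` helper and both thresholds are hypothetical names and values chosen for illustration, not part of any particular backup product. It flags any run that wrote very little data or finished suspiciously fast.

```python
from dataclasses import dataclass


@dataclass
class BackupRun:
    # Hypothetical record of one backup job; field names are assumptions.
    label: str
    bytes_written: int
    seconds: float


def suspicious_runs(runs, min_bytes=1_000_000, min_seconds=60.0):
    """Return the runs that wrote too little data or finished too quickly.

    The default thresholds are illustrative; pick values that match
    what a normal backup looks like at your site.
    """
    return [
        r for r in runs
        if r.bytes_written < min_bytes or r.seconds < min_seconds
    ]


# Made-up sample data: one healthy run and one that copied almost nothing.
runs = [
    BackupRun("monday", 42_000_000_000, 5400.0),
    BackupRun("tuesday", 512, 3.0),
]

for run in suspicious_runs(runs):
    print(f"check backup {run.label}: "
          f"{run.bytes_written} bytes in {run.seconds:.0f}s")
```

A size and duration check is no substitute for an occasional test restore, but it is cheap enough to run after every job.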
Another time, I was helping a firm in Florida upgrade its very expensive server. It was a piece of hardware called Tricord, which had all sorts of redundant power supplies, redundant networking cards and so forth. I recall that it cost $30,000 or so.
The company needed to bring down the server in order to replace a pair of networking cards, as they were upgrading from 10-megabit to 100-megabit Ethernet. (As you can tell from the numbers, this happened a long time ago.) The Tricord was running NetWare, and the firm had a very complex installation utilizing Unix, Windows and Mac clients.
We did all sorts of testing beforehand with the new network adapters to make sure that we had our driver act together. Unfortunately, we didn’t. We needed various patches and fixes to get the cards running properly.
Our next hurdle was hardware. We brought down the server, swapped out the cards, got everything in place and proceeded to turn on the server. Then we heard a small pop.
It turned out that the only piece of hardware that wasn’t duplicated was the power cord that went from the box into the AC outlet. The part that failed was where the cord plugged into the server, so we needed to replace it. Once we got the new part installed, the network ran a lot faster.
Another upgrade involved a firm in Vancouver, Canada, that was running an early version of LANtastic, one of the first peer-to-peer networking products. We were lucky to have a LANtastic developer along during the weekend when we brought down the server. We ran into a problem, and the developer had to ask a colleague to go to the office to track down a bug. We got a new version of the code and were back online on Monday.
Here’s a final story about how hard it is to eliminate all electronic traces when you fire someone. A hospital in Washington, D.C., found out that one of its recently departed staffers was still logging on to the network. They caught him when they saw his girlfriend’s login ID come up on a remote line while she was at work and already logged in from the hospital. It pays to read those logs, folks.
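For anyone who would rather not read those logs by eye, the pattern the hospital spotted (one login ID active from two places at once) is easy to look for in code. This is a rough sketch under stated assumptions: the `Session` record, its fields and the `overlapping_logins` helper are hypothetical, and the timestamps are plain integers rather than anything parsed from a real log format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical login record; real logs would need parsing into this shape.
    user: str
    source: str   # e.g. "lan" or "dialup" (made-up labels)
    start: int    # login time, any consistent unit
    end: int      # logout time, assumed greater than start


def overlapping_logins(sessions):
    """Return (user, source_a, source_b) for sessions of the same user
    that overlap in time but come from different sources."""
    by_user = defaultdict(list)
    for s in sessions:
        by_user[s.user].append(s)

    hits = []
    for user, items in by_user.items():
        items.sort(key=lambda s: s.start)
        for i, a in enumerate(items):
            for b in items[i + 1:]:
                if b.start >= a.end:
                    break  # sorted by start, so later sessions cannot overlap a
                if a.source != b.source:
                    hits.append((user, a.source, b.source))
    return hits


# Made-up sample: "jdoe" is on the LAN all day and also appears on a remote line.
sessions = [
    Session("jdoe", "lan", 900, 1700),
    Session("jdoe", "dialup", 1000, 1030),
    Session("asmith", "lan", 900, 1200),
    Session("asmith", "lan", 1300, 1700),
]

for user, first, second in overlapping_logins(sessions):
    print(f"{user} is logged in from {first} and {second} at the same time")
```

A hit is not proof of misuse, since some people legitimately keep two sessions open, but it is a good reason to ask a question.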
What’s the point of these stories? That managing a network is extremely challenging, and you never really know what’s going to happen next.