The Pieces of the Puzzle: Some Assembly Required

I spent several years as chief information officer at Video Monitoring Services of America, which monitors and captures every piece of news or advertising on television, radio and the Internet.

The scale of data collection was unbelievable: analog video and audio tapes coming in 24 hours a day, seven days a week. But collection wasn’t the only issue; storage and retrieval were a huge concern. All of the material had to be indexed so we knew where it was. For 14 offices, there were 14 data centers and 14 libraries of tapes that took up room after room. There was a common index, but an abstract of each item had to be entered manually. We already had about a terabyte of digital information, and we expected to need space for four times that much within 18 months.

It’s logical to split massive projects into smaller pieces. The trick is knowing precisely what size chunk to bite off, and how to put the pieces back together on the other side. This video project was no different. Cost was a concern, but the overall goal could be broken down: reducing time-to-delivery and labor costs; eliminating redundant equipment and the floor space consumed by analog inventory; gaining a competitive advantage. The benefits from each piece made the project’s capital cost, $4 million total for 51 media markets, more acceptable.

But could we place servers in each market to digitally encode content? No one had done it at the scale and low price point we needed, so we had to break even our requirements into individual pieces. We needed something inexpensive and reliable: an unattended server sitting at a remote location. It had to have a video-capture board and a foolproof operating system. We needed a reliable database and enough disk space to capture a continuous stream of video. And we needed a high-speed connection, a T1 line for each server.

This wasn’t something we could just purchase. We had to build it ourselves, buying the components and assembling them. Some pieces were built in Russia, others elsewhere. We opted for Linux servers, each with its own modem, dual video boards, two network interface cards and two 200GB disk drives, running on Pentium III chips with 2MB of cache. In the end, each one cost us $5,500.

We began to ship pieces of content across the Internet to a central facility using a store-and-forward technique. We broke the content into small chunks, brought them over the Internet and reconstructed them on the other end, which was the harder part. We had to work out our own algorithms to see how slowly we could capture raw, unedited video while still maintaining acceptable quality. Even with our initial five-gigabyte pipe, we had capacity issues on the receiving side as we captured all of this digital video. We actually had to slow the process down; otherwise we’d have had to put in faster servers and more-expensive Internet connections.
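
The mechanics were simple in outline, even if the production system wasn’t. Here is a rough sketch in Python of the store-and-forward idea; the chunk size, file names and directory layout are illustrative assumptions, not what we actually ran.

```python
import os
import shutil

CHUNK_BYTES = 16 * 1024 * 1024  # illustrative; our real chunks were time-based

def spool_chunks(capture_path, spool_dir):
    """At the remote server: split a capture file into numbered chunks."""
    os.makedirs(spool_dir, exist_ok=True)
    seq = 0
    with open(capture_path, "rb") as src:
        while True:
            block = src.read(CHUNK_BYTES)
            if not block:
                break
            # Zero-padded sequence numbers keep lexical order == capture order.
            name = os.path.join(spool_dir, f"chunk-{seq:06d}.part")
            with open(name, "wb") as dst:
                dst.write(block)
            seq += 1
    return seq

def reassemble(spool_dir, out_path):
    """At the central facility: stitch the received chunks back together."""
    parts = sorted(p for p in os.listdir(spool_dir) if p.endswith(".part"))
    with open(out_path, "wb") as dst:
        for name in parts:
            with open(os.path.join(spool_dir, name), "rb") as src:
                shutil.copyfileobj(src, dst)
```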

To determine the right packet size, we had to consider bandwidth and the rate of capture. We knew we had to offload the machines capturing the content at a certain speed in order not to overload them. So we went in with a number and played on each side of it until we settled on three minutes per packet. Determining packet size was a metaphor for the overall project: we were able to decide which pieces worked and which ones had to be reengineered before we put it all together.
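
The arithmetic behind that number looked something like the sketch below. The bitrates are assumptions for illustration, not our actual figures; what matters is the constraint: each chunk has to finish sending before the next one finishes being captured, or the backlog grows without bound.

```python
# Back-of-the-envelope check on chunk size. The ~1 Mbit/s capture rate
# is an assumed figure; the T1 rate is the standard 1.544 Mbit/s.
CAPTURE_BPS = 1_000_000       # assumed capture bitrate per channel
LINK_BPS = 1_544_000          # T1 uplink
CHUNK_SECONDS = 3 * 60        # the three-minute chunk we settled on

chunk_bits = CAPTURE_BPS * CHUNK_SECONDS
send_seconds = chunk_bits / LINK_BPS

print(f"{CHUNK_SECONDS}s of video takes ~{send_seconds:.0f}s to ship")
# If this assertion fails, the uplink can't keep pace and chunks pile up.
assert send_seconds < CHUNK_SECONDS
```

As long as the drain rate beats the capture rate, the chunk length mostly determines how quickly each capture machine gets offloaded, and that was the knob we played with on either side of three minutes.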

One example: We couldn’t slow down the flow of incoming video, so we slowed down the retransmission to our central location instead. That meant sending the video in parts. It also meant increasing the storage at each remote location. Fortunately, the price of disk space was constantly dropping. Where the original plan called for just two days’ worth of storage, we built in a buffer large enough to hold three or four days of video.
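
The buffer sizing was the same kind of arithmetic. Again, the per-channel bitrate below is an assumption for illustration, but it shows why a few days of video fit comfortably on the two 200GB drives in each server.

```python
# Rough buffer sizing. The 1 Mbit/s per channel is an assumed figure;
# two channels matches the dual video boards in each server.
CAPTURE_BPS = 1_000_000
CHANNELS = 2
SECONDS_PER_DAY = 86_400

def buffer_gb(days):
    bits = CAPTURE_BPS * CHANNELS * SECONDS_PER_DAY * days
    return bits / 8 / 1e9     # decimal GB, the way drives are sold

for days in (2, 3, 4):
    print(f"{days} days of video ~ {buffer_gb(days):.0f} GB")
# Even four days (~86 GB at these rates) leaves the 2 x 200GB drives
# plenty of headroom for the OS, the database and the outgoing spool.
```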

If we’d sent all the video at the same time, we would’ve been looking at huge bandwidth needs at every location. Each server would’ve needed the most storage, the most transmission speed, the most everything. By breaking the work into pieces, we were able to find the minimum needed in each spot.
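
The difference is easy to quantify. With the same assumed capture rate as above, shipping a whole day of video in a hypothetical one-hour window would demand a pipe more than twenty times fatter than trickling it out continuously:

```python
# Burst vs. trickle, using the same assumed 1 Mbit/s capture rate.
CAPTURE_BPS = 1_000_000
SECONDS_PER_DAY = 86_400
WINDOW_SECONDS = 3_600        # hypothetical: ship a day's video in one hour

day_bits = CAPTURE_BPS * SECONDS_PER_DAY
burst_bps = day_bits / WINDOW_SECONDS

print(f"steady trickle: {CAPTURE_BPS / 1e6:.1f} Mbit/s (a T1 suffices)")
print(f"one-hour burst: {burst_bps / 1e6:.1f} Mbit/s (a far bigger pipe)")
```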

-Written with Joshua Weinberger