Primer: Storage Partitioning

  • What is it?
    An administrative technique, carried out in software, for dividing a disk or an array of disks into clearly defined chunks of storage space, each of which can be assigned to one or more servers. It makes it easier to manage the increasingly large amounts of data found on farms of inexpensive disk drives. The phrase itself is rarely used, however, because the technique is so integral to other storage-management schemes; usually you’ll hear about “storage consolidation” instead.

  • Why should I care?
    Because storage capacity is increasing faster than the ability to manage it. Without partitioning, you can forget about managing your data library efficiently.

  • What do you mean by ‘consolidation’?
    Companies consolidate storage to get rid of the little pools of disk space attached to each server. While it’s good to have storage for each machine, it’s not always easy to predict the amount of disk space each will need. A server in Finance, for example, might connect to a drive with 150 gigabytes (GB) of disk space but fill only 100GB of it with data. Managers in Operations, meanwhile, might be asking for 50GB of additional space after maxing out the 200GB disk in their own server. It’s very hard to give Operations access to Finance’s excess capacity, so the company ends up buying an additional 50GB for Operations while 50GB of disk sits empty in Finance.
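
    To make the arithmetic concrete, here is a minimal Python sketch that compares how much disk the company has to buy when each server is stuck with its own drive versus when all the free space sits in one shared pool. The figures come from the Finance and Operations example; the function and variable names are made up for illustration.

```python
# Figures from the Finance/Operations example; all names are hypothetical.
servers = {
    "Finance": {"capacity_gb": 150, "used_gb": 100},
    "Operations": {"capacity_gb": 200, "used_gb": 200},
}
extra_demand_gb = {"Operations": 50}  # additional space being requested


def direct_attached_purchase(servers, extra_demand_gb):
    """GB to buy when a server can draw only on its own attached disk."""
    total = 0
    for name, need in extra_demand_gb.items():
        free = servers[name]["capacity_gb"] - servers[name]["used_gb"]
        total += max(0, need - free)
    return total


def pooled_purchase(servers, extra_demand_gb):
    """GB to buy when every server can draw on one shared pool."""
    free = sum(s["capacity_gb"] - s["used_gb"] for s in servers.values())
    return max(0, sum(extra_demand_gb.values()) - free)


print(direct_attached_purchase(servers, extra_demand_gb))  # 50
print(pooled_purchase(servers, extra_demand_gb))           # 0
```

    With separate disks the company buys 50GB; with a shared pool, Finance’s idle 50GB covers the request.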

  • How does it work?
    In consolidating, companies throw out storage attached to individual servers in favor of big honking arrays accessible to all the servers. That big pool of storage space can then be divided, by the operating software in the arrays, into chunks that can be assigned to each server. When a server needs more space, administrators can expand its chunk by changing the numbers in a configuration screen, without physically touching either the array or the server. “The theory is that you get economies of scale by going to a larger subsystem, but you can still divide it up into smaller pieces that are usable,” says Dianne McAdam, senior analyst at Data Mobility Group in Nashua, N.H.
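
    The bookkeeping behind that configuration screen can be sketched in a few lines of Python. This is a toy model under simple assumptions, not any vendor’s array software: the StoragePool class, its method names and the 350GB pool size are all invented for illustration.

```python
class StoragePool:
    """Toy model of one shared array divided into per-server chunks."""

    def __init__(self, total_gb):
        self.total_gb = total_gb
        self.allocations = {}  # server name -> GB assigned to it

    def free_gb(self):
        """Space in the pool not yet assigned to any server."""
        return self.total_gb - sum(self.allocations.values())

    def assign(self, server, size_gb):
        """Carve out a new chunk for a server that has none yet."""
        if server in self.allocations:
            raise ValueError(f"{server} already has a chunk; use expand()")
        if size_gb <= 0 or size_gb > self.free_gb():
            raise ValueError(f"cannot assign {size_gb}GB to {server}")
        self.allocations[server] = size_gb

    def expand(self, server, extra_gb):
        """Grow an existing chunk by changing a number, not moving hardware."""
        if server not in self.allocations:
            raise KeyError(server)
        if extra_gb <= 0 or extra_gb > self.free_gb():
            raise ValueError(f"cannot add {extra_gb}GB for {server}")
        self.allocations[server] += extra_gb


pool = StoragePool(total_gb=350)   # hypothetical pool size
pool.assign("Finance", 100)
pool.assign("Operations", 200)
pool.expand("Operations", 50)      # the extra space Operations asked for
print(pool.allocations, pool.free_gb())
```

    The point of the sketch is that expanding a server’s space is a change to a table entry, checked against what is left in the pool, rather than a trip to the data center.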

  • What’s the downside?
    Managing the access rights of the servers to these arrays of disks gets complicated as the number of disks and servers increases. Storage area networks (SANs) and network-attached storage (NAS) systems have emerged to make it easier to create ever-larger pools of storage that are still manageable.

    NAS boxes, as their name implies, attach to a network and add a lot of capacity—as much as several trillion bytes of data. Until recently, however, NAS boxes had a limited ability to expand, meaning when one was maxed out, the owner had no choice but to buy another box and reassign servers so that some would use the first box and some the second. In general, a company that loves its first and second NAS boxes loathes its tenth because of the additional work required to manage each.

    Newer NAS boxes can connect more easily to one another to share space and ease administration, but not as easily as storage area networks, which are designed specifically to link many storage devices into one enormous pool that can range into the tens of trillions of bytes. Setting up those connections on the SAN controller is complex, however, because administrators have to assign a port on the SAN controller to each server, as well as the ports the storage arrays can use to respond.
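
    To give a feel for why that setup work piles up, here is a deliberately simplified Python sketch of the port bookkeeping. It assumes one controller port per server and one per storage array, which real SANs do not require, and every name in it (assign_ports, the server and array labels, the eight-port controller) is hypothetical rather than taken from any actual SAN product.

```python
def assign_ports(servers, arrays, port_count):
    """Give each server, then each array, its own controller port number.

    Simplifying assumption: one port per device. Real arrays may use several.
    """
    needed = len(servers) + len(arrays)
    if needed > port_count:
        raise ValueError(f"need {needed} ports, controller has {port_count}")
    port_numbers = iter(range(port_count))
    server_ports = {name: next(port_numbers) for name in servers}
    array_ports = {name: next(port_numbers) for name in arrays}
    return server_ports, array_ports


server_ports, array_ports = assign_ports(
    servers=["finance", "operations", "payroll"],
    arrays=["array-a", "array-b"],
    port_count=8,
)
print(server_ports)  # {'finance': 0, 'operations': 1, 'payroll': 2}
print(array_ports)   # {'array-a': 3, 'array-b': 4}
```

    Even in this toy version, every server or array added means another entry an administrator has to get right, and the controller can run out of ports.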

  • What’s next?
    NAS and SAN systems, long incompatible, are coming together rapidly as it becomes clearer that large companies need the drop-in usability of a NAS box combined with the flexible capacity of a SAN. The merger is far from complete, however. The next major storage-system update, which will trickle down from the most expensive systems available today, is likely to be the ability to prioritize requests so the most time-sensitive applications get data first.
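
    One common way to prioritize requests is a priority queue, sketched below in Python with the standard heapq module. This is an illustration of the general idea, not a description of how any storage vendor does it; the RequestQueue class, the application names and the priority numbers are all assumptions.

```python
import heapq
import itertools


class RequestQueue:
    """Serve the most time-sensitive request first (lower number = more urgent)."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: first come, first served

    def submit(self, priority, app, block):
        """Queue a read request from an application for a block of data."""
        heapq.heappush(self._heap, (priority, next(self._order), app, block))

    def next_request(self):
        """Pop the most urgent request; ties go to whichever arrived first."""
        _priority, _seq, app, block = heapq.heappop(self._heap)
        return app, block


queue = RequestQueue()
queue.submit(5, "backup", "block-17")
queue.submit(1, "trading", "block-42")
queue.submit(5, "reporting", "block-03")

print(queue.next_request())  # ('trading', 'block-42')
print(queue.next_request())  # ('backup', 'block-17')
print(queue.next_request())  # ('reporting', 'block-03')
```

    The trading request jumps the line even though it arrived second, while the two lower-priority requests are served in the order they came in.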