Managing Big Data in the Cloud

By Bob Violino

Two of today's hottest technology trends—big data and cloud computing—converge as enterprises seek to get a handle on their growing volumes of information.

Collecting Data From Around the World

Another organization using big data in the cloud is the Virginia Bioinformatics Institute (VBI), a research institute in Blacksburg, Va. VBI conducts genome analysis and DNA sequencing using about 100 terabytes of data that's collected each week from around the world.

"Our largest project is the downloading and reanalysis of every sequenced human genome to identify new biomarkers and drug targets, especially for cancer," says Skip Garner, executive director and professor at VBI. "We are analyzing approximately 100 genomes per day, and these are all downloaded from the cloud."

Data generated from various scientific sources is downloaded and then analyzed on VBI servers. "Recently, it has become easier and more efficient to download what we need and not keep local copies, for it amounts to tens of petabytes," Garner says. "So the cloud has enabled us to download, use and throw away raw data to save space, and then download again if necessary."
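The download, use and discard pattern Garner describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VBI's actual pipeline: the names `process_and_discard`, `fake_fetch` and `checksum` are hypothetical, the "download" writes a few local bytes instead of contacting a real cloud service, and the "analysis" is just a hash of the data.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def process_and_discard(
    fetch: Callable[[Path], None],
    analyze: Callable[[Path], str],
) -> str:
    """Download raw data to scratch space, analyze it, then delete it."""
    # TemporaryDirectory removes the directory and its contents on exit,
    # so only the small analysis result outlives the call.
    with tempfile.TemporaryDirectory() as scratch:
        raw_path = Path(scratch) / "raw_data.bin"
        fetch(raw_path)           # hypothetical: copy remote data to raw_path
        return analyze(raw_path)  # hypothetical: return a compact summary


def fake_fetch(dest: Path) -> None:
    # Stand-in for a real download (assumption); writes a few bytes locally.
    dest.write_bytes(b"ACGT" * 1000)


def checksum(path: Path) -> str:
    # Stand-in for a real analysis (assumption); returns a digest of the data.
    return hashlib.sha256(path.read_bytes()).hexdigest()


result = process_and_discard(fake_fetch, checksum)
```

Because nothing raw is kept, running the same call again simply downloads the data again, which mirrors the trade of bandwidth for local storage described above.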

The institute hasn't used cloud compute resources for the research work because its codes "are memory hogs, requiring servers with at least a terabyte of RAM," he explains.

Managing big data in the cloud does come with challenges, Garner points out. The big issues are security and intellectual property. For example, VBI has permission to download certain data sets, and under those agreements it must maintain control of the data, allowing only certain people to access it.

"We can be absolutely sure of where the data is when it is in our servers, and we are confident that we are adhering to the terms of agreements," Garner says. "That is not [the case] when data is in the cloud. So, currently, we do not put data in the cloud, we only download."

Downloading and using data from the cloud saves VBI a lot on storage costs, and the return on investment was "immediate," according to Garner.

As organizations approach big data, their first choice for compute and storage platforms should be the cloud, says Chris Smith, U.S. federal chief technology and innovation officer at New York-based Accenture, a global management consulting company.

"Low cost, highly scalable and elastic capabilities are the right formula for implementing big data," Smith says. "In some cases, a big data solution in a highly secure environment may dictate an internal data center [strategy], but most organizations are developing their own internal private clouds, and this is the right place for those specific solutions as well."

Organizations continue to adopt and implement private, public and hybrid clouds, "with these technologies having become mainstream choices for developing new capabilities," Smith says. "I expect to see increased and even more rapid adoption over the next 18 to 24 months."

As organizations increase the breadth and depth of their business technology offerings in the cloud, Smith says, they need to manage information across multiple heterogeneous environments. Doing so lets them clearly develop, analyze and articulate the state of the business, and provide highly available, high-performing services that deliver value.

"A robust cloud brokering and orchestration capability that puts the organization in the driver's seat to maintain, deliver and innovate new and better services will be key for the enterprise," Smith says.

The cloud itself will continue to generate lots of data, says London-based research firm Ovum. In "2013 Trends to Watch: Cloud Computing," the firm says that 2013 will see cloud computing continue to grow rapidly. Cloud computing in all its types—public, private and hybrid—is building momentum, evolving fast and becoming increasingly enterprise-grade, Ovum says.

Cloud computing services—and the social and mobile applications that cloud platforms underpin—are generating a lot of data, which, in turn, requires cloud services and applications to make sense of it, Ovum notes.

This trend is fueling other industry trends, such as the Internet of things (machine-to-machine communication and data processing), consumerization of IT and big data.

This article was originally published on 2013-02-21

Bob Violino is a contributing writer to Baseline and editorial director at Victory Business Communications.
