OHSU Takes a Surgical Approach to Cancer Data

The growing volume, variety and velocity of data (the so-called 3Vs) are redefining almost every organization across every industry. However, over the next few years, no field is likely to undergo a greater transformation than health care and medicine, where outcomes and lives are constantly on the line.

“Precision medicine is increasingly becoming a data-driven science,” says Adam Margolin, director of computational biology at the Oregon Health & Science University (OHSU) School of Medicine in Portland, Ore. “The ability to use detailed molecular information and interpret it in the context of hundreds of millions of patients is at the center of medicine.”

OHSU hopes to turn that concept into reality by 2020. Doing so would deliver practical benefits, including more effective diagnosis and treatment.

“We want to be able to characterize their disease, the type of cancer they have, and identify the best treatment for that person based on genetics and how previous patients have responded to different treatments,” Margolin explains. Studies and data focused on this area are beginning to accumulate, he adds. “The ability to sort through all the data is critical. If we’re successful in learning from the data and putting it to work, it’s possible to make breakthrough advances.”

OHSU turned to Intel to assist with the initiative, dubbed the “Collaborative Cancer Cloud.” The university is employing a Trusted Analytics Platform (TAP) to collect, manage and study health data drawn from a variety of sources.

Processing Data From a Variety of Sources

Although the project initially focused on a custom data analysis platform for OHSU, it has since evolved into a broader initiative that involves several open-source software components and a move toward a reference architecture that stitches together both OHSU systems and public clouds. Among other things, this allows scientists to collect and process data from a variety of sources, including watches, wearables, legacy databases and leading-edge genomic research, all while pushing the envelope on performance.

The project focuses on four key technology pillars. Federated workflow orchestration is at the center of the initiative, according to Margolin.

“We will not build out a central database to store and manage all the data,” he says. “Instead, we will allow the data to remain under institutional control. That way, users can interact with data as if it’s one logical data set.”
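The federated idea Margolin describes can be illustrated with a minimal Python sketch. It is not the Collaborative Cancer Cloud's actual design or API; the Site class, the record fields ("treatment", "responded") and both functions are hypothetical names chosen for illustration. The sketch assumes each institution runs a query locally and returns only aggregate counts, so raw records stay under institutional control while the caller sees what looks like one logical data set.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]
Query = Callable[[List[Record]], Dict[str, int]]


@dataclass
class Site:
    """A hypothetical institution that keeps its patient records locally."""
    name: str
    records: List[Record]

    def run(self, query: Query) -> Dict[str, int]:
        # The query travels to the data; only the aggregate result leaves the site.
        return query(self.records)


def count_responders(records: List[Record]) -> Dict[str, int]:
    """Count, per treatment, how many local patients responded."""
    counts: Dict[str, int] = {}
    for record in records:
        if record["responded"]:
            treatment = str(record["treatment"])
            counts[treatment] = counts.get(treatment, 0) + 1
    return counts


def federated_query(sites: List[Site], query: Query) -> Dict[str, int]:
    """Run the same query at every site and merge the per-site aggregates."""
    merged: Dict[str, int] = {}
    for site in sites:
        for key, value in site.run(query).items():
            merged[key] = merged.get(key, 0) + value
    return merged


# Illustrative, made-up records held by two separate institutions.
sites = [
    Site("site_a", [
        {"treatment": "drug_x", "responded": True},
        {"treatment": "drug_y", "responded": False},
    ]),
    Site("site_b", [
        {"treatment": "drug_x", "responded": True},
        {"treatment": "drug_y", "responded": True},
    ]),
]
totals = federated_query(sites, count_responders)
```

In this sketch the caller never touches the records directly; it only receives merged counts. A real deployment would also need authentication, auditing and safeguards against queries that leak individual records.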

The second and third pillars are data acceleration techniques and image processing pipelines, which OHSU will rely on to tackle huge data sets, particularly imaging data.

The fourth pillar is a heavy emphasis on data privacy. “Any data shared through the network allows the owner to maintain very tight controls over who and what accesses the data,” Margolin reports.
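One way to picture owner-controlled sharing is the short Python sketch below. It is an assumption-laden illustration, not the project's actual privacy mechanism: the SharedDataset class, its grant, revoke and read methods, and the requester and purpose labels are all hypothetical. It assumes the data owner approves specific pairs of requester and purpose and can withdraw that approval at any time.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SharedDataset:
    """A hypothetical data set whose owner decides who may read it and why."""
    owner: str
    rows: List[Dict[str, object]]
    # (requester, purpose) pairs that the owner has explicitly approved.
    grants: Set[Tuple[str, str]] = field(default_factory=set)

    def grant(self, requester: str, purpose: str) -> None:
        self.grants.add((requester, purpose))

    def revoke(self, requester: str, purpose: str) -> None:
        self.grants.discard((requester, purpose))

    def read(self, requester: str, purpose: str) -> List[Dict[str, object]]:
        # Access is denied unless the owner approved this requester AND this purpose.
        if (requester, purpose) not in self.grants:
            raise PermissionError(f"{requester} is not approved for {purpose}")
        return list(self.rows)


dataset = SharedDataset(owner="site_a", rows=[{"sample": "s1", "variant": "v1"}])
dataset.grant("lab_b", "cancer_research")
rows = dataset.read("lab_b", "cancer_research")  # allowed while the grant stands
dataset.revoke("lab_b", "cancer_research")       # the owner can withdraw access
```

Checking both the requester and the purpose reflects the "who and what" in the quote; a production system would enforce this at the network and storage layers rather than in application code.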

Beyond those pillars, the data platform is designed to be highly scalable. It incorporates Hadoop, Cloud Foundry and NoSQL databases such as Apache HBase.

APIs will be critical to tying together a diverse array of IT systems and data sources. OHSU is working closely with the Global Alliance for Genomics & Health to develop standards and data interoperability methods that will allow data to flow across the health care community.

“The goal is to introduce data-driven precision medicine that saves lives,” Margolin says. “This initiative could have a profound effect on treating cancer, as well as other genetic syndromes, including autism, neuropsychiatric diseases and heart diseases.”