Rx for Data Woes

When Pfizer Research Fellow Dr. Michael Linhares and his team introduced a data integration framework in January 2008, it had to accommodate the company’s rigorous drug development process. Here, Linhares discusses the teamwork that led to the creation of virtual data marts, which were designed to incorporate maximum adaptability into a necessarily rigid, standards-based framework.

Informed decisions bolster our corporate success at every stage of drug discovery, development, manufacturing and distribution. That’s crucial for Pfizer, which works on an average of a hundred potential commercial drug projects annually, with five typically culminating in delivery to market each year.

As the world’s largest drug manufacturer, Pfizer devotes 15 percent ($7.5 billion) of its corporate budget to R&D and has more than 13,000 R&D employees worldwide. This investment has resulted in such household-name drugs as Lipitor and Viagra.

To discover and develop tomorrow’s pharmaceutical breakthroughs, we continually strive to find and maintain the ideal balance between the rigid processes required to meet high manufacturing standards and the flexibility that’s vital for nourishing creative risk-taking and innovation.

Drug manufacturing is highly complex and stringently regulated, and it requires volumes of data to fuel decision making during the eight to 10 years it typically takes to transform an experimental compound into a viable drug. Having timely, accurate information is particularly critical for our Pharmaceutical Sciences (PharmSci) group, whose analytical executives determine the drugs Pfizer will bring to market. Missteps and delays are costly, averaging $10 million per day.

The information used by PharmSci executives comes from numerous disparate, geographically dispersed sources, such as our global information factory, enterprise project management, inventory and supply chain, portfolio and project management systems. The data in these sources changes constantly as new research is conducted and new conclusions are reached. This complexity is further compounded because drug development is not a linear march from point A to point G, but a constantly moving target.

Here’s a typical scenario that we might discuss: We’ve found a promising new API (active pharmaceutical ingredient), or compound, which we believe could be developed into a new drug or, perhaps, even multiple drugs. We need key planning and resource information.

What is the forecast of potential success or failure? If we predict potential success, how many new resources should be assigned? What are these resources currently working on? We need data to answer these and many other questions. Of course, we need this data by next Tuesday for our business and strategy planning meeting.

In the Business Information Systems group within PharmSci, we deliver data for decision making regarding project and portfolio investments, including resources, expenses and planning. The scope of this information ranges from the mapping, forecasting and budgeting data used for future planning and modeling to the details needed for a project’s execution.

We’ve traditionally relied on formal, repeatable reports containing analytical information from physical data marts and data warehouses. But when our executives need information that’s not readily available from the existing reports—and this happens regularly—the process to extract and deliver the data is time-consuming, costly and cumbersome.

To supplement the formal reports, we produced informal, spreadsheet-based solutions we call spreadmarts. At face value, spreadmarts are a convenient way to provide short-term answers: they are one-off spreadsheets built from embedded calculations and data hand-entered from multiple sources. But spreadmarts contain information that has been separated from its source and, therefore, becomes outdated quickly. They do not conform to security and other controls, and they don’t offer scalability or reuse across various user domains.

Although short-term by definition, spreadmarts actually take weeks to build. And, at the end of the four to six weeks it typically takes to build a spreadmart, there is only a 50 percent chance that we will have arrived at all the facts required. Often, we have to abandon the process or restart it.

We were determined to find an alternative solution that offered rapid access to up-to-date data, conformed to security controls, offered scalability as our information needs grew and enabled reuse.