Cloud Computing

Migrating a Big Data Warehouse to the Cloud

By Guest Author
Last updated May 1, 2014

By Adam Haines

In today’s world, big data has become the norm. But transitioning big data sets from on-premises systems to a cloud-based system is cumbersome and fraught with challenges.

At Federated Sample, our business focuses on ensuring that the market research industry secures quality sample respondents for its research. The studies often have complex quotas and highly defined targeting requirements, which can make finding the right sample an involved process.

Initially, we stored our data in on-premises Microsoft SQL Server systems. However, as our company grew and requests for information increased, running queries against these transactional systems created a bottleneck. To overcome this challenge, we looked for an end-to-end data warehousing solution that would enable us to increase efficiency and more effectively meet the real-time reporting and business intelligence needs of our customers.

Since Amazon Redshift did not require a significant capital investment and we were already an Amazon Web Services (AWS) shop, that was the logical data warehouse choice for us. But it was only part of the equation; we still needed to migrate data from our Microsoft SQL Server systems to the cloud-based Redshift data warehouse.

As we made our transition to the cloud, the following lessons became clear:

Before you start migrating to the cloud, investigate third-party solutions.

As a cost-effective measure, our team built a homegrown solution that required the use of SQL Server Integration Services (SSIS), with data traveling through multiple stages for encoding and adaptation for direct copying into Redshift. Unfortunately, the solution was labor-intensive and rife with problems: The file-encoding process was error-prone, and only small files could be sent, with no way to identify or track errors. Our initial upload took more than a week, and subsequent updates took hours.

Automate data migration and syncing.

I can’t stress this enough: There is no need to manually migrate your data. Unless you’re working with very small data sets (under 300GB) and small transaction loads, you will regret not automating.

In our case, with our homegrown solution, I was essentially working around the clock to oversee data migration and synchronization. Based on my salary, the cost of our third-party solution paid for itself in less than a month. It was well worth the investment, since I can now focus on higher-value projects.

If possible, do a proof-of-concept (POC) with your cloud provider and other third-party solution providers.

We did a POC with Attunity to be certain that we could load our full database in a reasonable amount of time. That gave our team the confidence to move forward, and it helped us secure the budget for the project.

Migrating a Big Data Warehouse to the Cloud

Before you start migrating to the cloud, investigate third-party solutions.

Automate data migration and syncing.

If possible, do a proof-of-concept (POC) with your cloud provider and other third-party solution providers.

Guest Author

More Stories

Dipping crude oil prices, surging silver, and a reportedly bullish USDCAD

Gold and silver face challenges amid economic optimism

AltStore PAL boosts app diversity in the EU

Wall Street rattled by tech giants’ earnings slump

Proposed retirement age shift sparks debate in China

South Korea overhauls inheritance tax to boost economy

Siri Suggestions widget simplifies iPhone app usage

Software Development Staff Augmentation in 2024

6 Startup Essentials Every New Business Needs to Get Right