Migration to the cloud: How does this work in practice?

Answers to the most frequently asked questions about cloud migration

Dennis Dickmann
Data Engineer
5 min
08 Jan 2024

Traditionally, companies stored all of their data locally in an on-premise environment. Increasingly, however, they are migrating their data infrastructure to the cloud. Cloud computing uses servers managed and maintained by cloud service providers such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform. This article answers the questions you are likely to have when considering a migration to the cloud.

Why are an increasing number of organisations migrating their data infrastructure to the cloud?

There are several reasons for this:

  1. Cloud platforms have developed so rapidly that many advanced functionalities are difficult to match on-premise. On-premise environments demand considerable knowledge and expertise, plus regular upgrades to new software, frameworks, and technologies, while modern tools and techniques are primarily offered cloud-first. Adding new functionality on-premise may also require expanding your hardware, which in turn demands considerably more Full-Time Equivalents (FTEs) for environment management.
  2. The cloud is relatively accessible and cost-effective. The time-to-production is shorter, system setups are simpler, and there is more dynamic and flexible access to resources. For instance, if you need extra GPU capacity for a short period, you can rent it from the cloud provider for an hour. On-premise, you would need to purchase the actual hardware (GPU) for this purpose.
  3. Cloud providers benefit from significant economies of scale. As a result, they can offer complete services, including security, user management, logging, and monitoring, much more affordably than if you were to do it yourself. Additionally, due to their scale, they can provide many more functionalities that you can easily deploy yourself, reducing dependence on engineers.
  4. Cloud solutions generally guarantee a minimum uptime of 99.9%. To achieve the same result on-premise, you would need a large team with extensive expertise.
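To make the 99.9% figure concrete, an uptime percentage translates into allowed downtime with a simple back-of-the-envelope calculation:

```python
# How much downtime per year does a given uptime SLA permit?
HOURS_PER_YEAR = 365 * 24  # 8760 hours (ignoring leap years)

def allowed_downtime_hours(uptime_pct: float) -> float:
    """Hours of downtime per year permitted by an uptime percentage."""
    return HOURS_PER_YEAR * (1 - uptime_pct / 100)

for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% uptime allows {allowed_downtime_hours(sla):.2f} h/year of downtime")
```

At 99.9% uptime, that is under nine hours of downtime per year, a level that is hard to sustain on-premise without a dedicated team.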

What options are available for storing your data in the cloud?

If you want to process your data for analytical purposes in a data platform, you need to store the data. The structure in which you store and bring together the data is called the data model.

A data lake or data warehouse serves as a storage location for data, typically not (directly) used by operational systems.

A data warehouse has a structured method of storage, whereas a data lake provides much more freedom. However, this doesn't make a data warehouse 'better' than a data lake, just different.

Often, a data lake is better suited for storing raw, relatively unprocessed data. Nowadays, a data lakehouse architecture is also frequently chosen. This is a setup where a separate processing engine runs alongside the data lake and writes the results of its transformations back to the lake.
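As a rough illustration of that write-back pattern, here is a minimal sketch using only the Python standard library: raw JSON files land in a lake directory untouched, and a toy "processing engine" reads them, aggregates, and writes the curated result back alongside the raw zone. The paths, file names, and event fields are invented for the example; a real lakehouse would use object storage and an engine such as Spark.

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def transform(events):
    """Toy transformation: total revenue per customer."""
    totals = {}
    for e in events:
        totals[e["customer"]] = totals.get(e["customer"], 0) + e["amount"]
    return totals

with TemporaryDirectory() as root:
    lake = Path(root)
    raw = lake / "raw"          # landing zone: unprocessed source data
    curated = lake / "curated"  # write-back zone for transformed results
    raw.mkdir()
    curated.mkdir()

    # Raw events land in the lake as-is, no upfront schema required.
    (raw / "events_1.json").write_text(json.dumps(
        [{"customer": "a", "amount": 10}, {"customer": "b", "amount": 5}]))
    (raw / "events_2.json").write_text(json.dumps(
        [{"customer": "a", "amount": 7}]))

    # The "processing engine": read raw files, transform, write back.
    events = [e for f in sorted(raw.glob("*.json"))
              for e in json.loads(f.read_text())]
    result = transform(events)
    (curated / "revenue_per_customer.json").write_text(json.dumps(result))
    print(result)
```

The key point is that the raw data stays immutable in the lake, while curated, query-ready results live next to it.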

Read more about data storage here.

What does a cloud migration look like in practice?

The way you migrate your data infrastructure from an on-premise environment to the cloud depends on two factors: the scale of the solution and the complexity of the environment. The more significant the migration, the greater the need for a thorough analysis beforehand, as described below in the discovery phase.

We recommend breaking down your cloud migration project into three phases:

1. Discovery Phase

  • Analyse your existing processes. Which are analytical, transactional, or operational? To what extent is each process business-critical? The more critical a process, the greater the impact and risk of migrating it.
  • Explore cloud solutions that can replace your current on-premise infrastructure. You can choose between two main options: Compute services, such as virtual machines that you largely set up and manage yourself, or managed services that take over maintenance and management tasks. With managed cloud services, you need to invest more in reprogramming your processes, but they work quickly and intuitively, allowing you to go into production swiftly. Opting for self-managed compute services will keep you closer to your on-premise setup, with a lower initial investment, though you may not fully leverage the cloud solution.
  • Align your migration project with the guidelines of the internal IT & Operations department. If there are none, establish them. Security and governance play a crucial role. Who has access to which data? Who is responsible? Who has decision-making authority?
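A discovery-phase process inventory can be sketched in a few lines. The process names and criticality scores below are hypothetical examples; the idea is simply to rank processes so that the least business-critical ones migrate first, keeping early mistakes cheap.

```python
from dataclasses import dataclass

@dataclass
class Process:
    name: str
    kind: str         # "analytical", "transactional", or "operational"
    criticality: int  # 1 (low business impact) .. 5 (high)

def migration_order(processes):
    """Rank processes for migration: least business-critical first,
    so early mistakes carry the smallest impact and risk."""
    return [p.name for p in sorted(processes, key=lambda p: p.criticality)]

# Hypothetical inventory for illustration
inventory = [
    Process("order intake", "transactional", 5),
    Process("monthly sales report", "analytical", 2),
    Process("stock synchronisation", "operational", 4),
]
print(migration_order(inventory))
```

In practice this inventory would also record data volumes, dependencies, and owners per process, but even a simple ranking helps scope the first migration wave.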

Check out our webinar on data governance.

Based on the discovery phase, create a step-by-step plan with a broad scope of the project. Other deliverables from this phase include a technical design and an overview of all technical and functional requirements. When creating the technical design, you'll encounter critical choices: will you build and manage everything yourself, or will you use third-party tools like Data Factory, Fivetran, and dbt? Opting for such tools requires less technically advanced knowledge, making your migration faster and simpler. However, you should factor in ongoing subscription or pay-per-use costs. Conversely, if you keep everything in-house, your project becomes much more complex.

2. Proof of Concept (PoC) or Minimum Viable Product (MVP) Phase

In this optional phase, you work on a small scale to address the most uncertain factor of your project. You test its functionality and try to validate your assumptions. If it works as intended, you can build on it; if not, you can make adjustments early on.

While a PoC is used solely to demonstrate the effectiveness of the technology and the technical design, an MVP is already a fully functional product but on a small scale. For instance, a PoC may not need to contain production data, while an MVP usually does.

3. Implementation Phase

Once there is a final agreement, your technical team can set up the infrastructure. Continuous collaboration with end-users and ongoing discussions with stakeholders are crucial during this phase. Your engineering team must not only be technically proficient but also adept at explaining how certain things work. After all, you want to avoid a scenario where your new platform doesn't align with the end-users' needs.

What are the key conditions for a successful migration?

Clear communication with end-users and stakeholders is essential. Therefore, you need a competent Product Owner or Project Manager who understands both the technical aspects and the end-users' perspective. Additionally, working with professionals who have executed similar projects and faced comparable challenges is crucial. An experienced specialist prevents reinventing the wheel and can foresee potential issues in advance. Knowing not only how things should be done but also how they should not be done is vital. Identifying errors early on saves a significant amount of time and money.

What are the risks and drawbacks of a cloud migration?

During a migration to the cloud, you are relocating part of your company's infrastructure. As in any other IT project, challenges may arise along the way. It is important to remain flexible in finding solutions and to keep stakeholders engaged, in other words, to work in an agile manner. How much flexibility is needed depends largely on the project's size and estimated complexity: migrating a medium-sized database or data warehouse typically won't present many unforeseen complexities.

When a new data model needs to be implemented alongside the migration, it must be well-documented. The time for design and implementation varies based on the model's complexity, making it challenging to scope.

Consider that after the migration, your on-premise specialists may need to be retrained. If that is not feasible, you might need to hire new personnel or outsource the work. Note that managed cloud services involve relatively little maintenance; the focus often shifts to maintaining the data model, where data expertise becomes more important than IT expertise.

Who are the stakeholders, and who performs the migration?

The primary stakeholders include your company's IT organisation, data end-users, and senior management. The CEO and CTO are mainly involved in the decision to migrate to the cloud but don't play a role in the execution. IT support is crucial for aspects like providing system access and advising on the most efficient connection to local data sources.

The project team for a cloud migration ideally consists of:

  • A project manager overseeing the entire project, skilled in stakeholder management, expectation management, and change management.
  • A Solution Architect or Cloud Architect determining the necessary infrastructure.
  • One or more specialists with the technical experience to execute the implementation. The market often uses the umbrella term 'Data Engineer' for this role; in practice, you need someone specialised in an area such as DevOps or platform engineering, or an experienced Python programmer.

For structuring a data warehouse, an Analytics Engineer is necessary, while a Data Analyst can assist in setting up the right dashboards for your business inquiries.

Read a practical case study here.

What does a migration to the cloud cost?

The licence for a cloud solution is more expensive than its on-premise counterpart. This comparison is misleading, however, because in the long run you will spend less on a cloud platform, for the following two reasons:

  • Cloud solutions provide standard services, including security, user management, and logging and monitoring. Maintenance and management tasks are also largely automated, resulting in savings on Full-Time Equivalents (FTEs).
  • All costs associated with managing your on-premise infrastructure disappear. For example, you will no longer incur costs for hardware.

What you pay the cloud provider depends on various factors, such as the amount of data, the refresh rate of source data, the complexity of the data model, and the frequency and complexity of executed queries.
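These factors can be sketched as a toy estimate. The rates below are made-up placeholders, not real provider prices, and the linear model ignores refresh rate and query complexity; for a real estimate, use your provider's pricing calculator.

```python
def monthly_cloud_cost(tb_stored: float, queries_per_month: int,
                       storage_rate_per_tb: float = 20.0,
                       cost_per_query: float = 0.01) -> float:
    """Toy cost model: storage plus query volume.

    The default rates are illustrative placeholders only; real pricing
    depends on the provider, region, storage tier, and bytes scanned.
    """
    return tb_stored * storage_rate_per_tb + queries_per_month * cost_per_query

# e.g. a hypothetical workload: 5 TB stored, 10,000 queries per month
print(monthly_cloud_cost(5, 10_000))
```

Even a rough model like this makes one point visible: query-driven costs scale with usage, unlike the fixed hardware costs of an on-premise setup.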

The migration project itself incurs costs as well. The size of your investment depends heavily on the requirements, the scale of the solution, and the complexity of the environment. If you plan to revise your data model or connect new sources, the investment may increase. Cloud migration is often linked to a business question arising from a need for more insight into data or better technical performance.

From a technical perspective, you can execute a cloud migration for a few thousand euros. However, achieving the desired added value and fully meeting the guidelines of your organisation, such as security or data quality, may require additional investments.

Need help with your cloud migration?

If you want insight into the likely costs and advice on your migration project, we are happy to have a conversation.
