EOS: Migrating an On-Premise DWH

EOS: Migrating an On-Premise DWH

From On-Premises to AWS Redshift: Successful DWH Migration with Vision

More Efficient, More Agile, More Sustainable: EOS Is Transforming Its Data Platform!

EOS was faced with the challenge of modernizing its on-premises DWH. The migration to AWS Redshift took place together with b.telligent – for greater scalability, 30% faster data processing and 50% lower operating costs. The result: a powerful, flexible data platform for data-driven growth.

Quick Facts About the Project

Map pin icon

Location & Sector: Germany, Financial Services

Building icon

Company size: Large companies

Clock icon

Project duration: 8 months

Folder icon

Project type: Consulting project I DWH Migration

cog icon

Technologies: AWS Redshift, Ab Initio, Microstrategy, Power BI, Protegrity

About the client

50 %

The new cloud infrastructure reduces running costs by around 50%.

Initial Situation and Challenge

EOS Technology Solutions (TS) is the internal IT service provider of the EOS group. The OTTO group's enterprise is active in the field of receivables management. As a central service company, EOS TS offers a data platform to the companies of the German EOS subsidiary (EOS-DID). In addition to a data lake, this platform includes a DWH, an ETL tool, a BI solution and data-science services. The aim of this project was to migrate the on-premise data platform to the cloud.

Platform Migration to AWS

This project encompassed all three phases of a migration process, from assessment through mobilization to actual migration.

Three-phase project model for cloud migration: 1) Assess (3 months) – feasibility and business case; 2) Mobilize (3 months) – training, architecture, planning; 3) Migration & Modernize (8 months) – technical implementation across infrastructure, data, reporting, and data science with minimal business disruption.

Different migration strategies were pursued for each functional building block of the data platform:

# Category Technical solution Migration technique
1 Data Warehouse Exasol → Redshift refactor
2 Data Processing AbInitio on AWS rehost & refactor
3 BI & Reporting Microstrategy → PowerBI rearchitect
4 PII-Data Security Protegrity on AWS repurchase
5 Data Science Implementation with Data Science Toolstack on AWS (Sagemaker, …) refactor
6 Data Lake & Exploration S3, Athena, External Tables, PowerBI on DWH/Athena refactor / new
7 Landing dates / ingest S3, DMS, on premises Kafka, JDBC/ODBC refactor / architect

In the following, we will discuss the replacement of the Exasol database with AWS Redshift in more detail.

The Problem

DWH migration is always a complex undertaking if you decide against a lift-and-shift strategy and instead pursue re-factoring to a new database engine, as in this case. The following requirements must then be met:

  • Reduce infrastructure costs
  • Maintain and perhaps even improve the performance of loading and querying processes
  • Maximize migration speed in order to minimize the impact on further development and daily business.
  • Ensure continued technical correctness of translated/optimized ETL processes

In particular, the memory limitation of the existent DWH on Exasol (license and hardware capabilities of the on-premise server) was also a major factor prompting replacement with Redshift here.

Solution Approach and Solution Architecture

A decision against a lift-and-shift approach was not clearly evident at the beginning of the project. In a first step, a benchmark was established with comparable data volumes and specific EOS-SQL queries, thus enabling an initial statement about adequate sizing of a Redshift. It turned out that a Redshift with about half the hardware capability (CPU & main memory) can measure up to the Exasol database. At that time, it was already clear that parts of the existent SQL scripts and views of the Redshift database needed to be optimized to achieve a performance comparable to that of the Exasol DB.

This can be explained primarily by architectural differences between the two databases (e.g. sort keys instead of indices, and different behaviour of query optimizers). For cost reasons, a larger variant of the Redshift DB was not an option, and in such cases, it is usually advisable to optimize the software (the SQLs) first. As a search for automation tools for migrating Exasol to Redshift did not yield satisfactory results, it was decided to perform the migration "manually". This was carried out in the following steps:

  1. Inventory of migration objects (SQL scripts and views)
  2. Implementation of a generator (shell script) based on Exasol dictionary tables for:
    a) Exasol table export to S3
    b) table imports from Redshift to S3
    c) Generate table DDLs for the Redshift database
  3. Creation of a “translation list” to replace EXASOL SQL with Redshift SQL
  4. Testing the speed  of a data migration, which turned out to have a reasonable range of around 6 hours, thus eliminating the need for complex data migration using CDC or AWS Snowball.
  5. Implementation of a test framework for a detailed comparison of Exasol data (tables & views) with Redshift data. Detailed statistics (COUNT(*), COUNT(distinct column), MIN, MAX, AVG, LENGTH) were collected and compared for each table/view here. This test framework was based on data dictionaries, and therefore generated automatically.
  6. Continuous translation of SQL scripts and views, accompanied by a comparison of data from both databases.
  7. Parallel operation of the two DWH using the test framework to eliminate residual errors in migration over a period of approx. 4 weeks.

Learnings

A DWH migration is a complex project from which we (b.telligent and EOS) have learned a lot:

Graphic showing four key learnings from EOS's cloud migration project: 1) Exasol SQL translation was the most complex part; 2) smart SQL optimization significantly improved Redshift performance; 3) too many parallel projects overloaded resources; 4) lack of AWS know-how highlighted the need for internal training.

Although such migrations can be packaged into standard procedures (assess, mobilize, migrate & modernize), it is ultimately the details of the existent system and dependencies between functional building blocks which require special attention and expenditure.

Voices From the Project

Quote icon

With the new cloud infrastructure, we are well-positioned for the coming years. The cloud is an enabler for a whole range of national and international issues, such as establishment of a DWH for the entire EOS Group.

Jakob Tewis

Team Lead Data Insights at EOS Group

b.telligent Services at a Glance

badge icon

Cloud Migration Strategy

Development of a tailored migration strategy in three phases — Assess, Mobilize, Migrate

badge icon

DWH Refactoring

Manual SQL translation and optimization for Redshift including test framework

badge icon

CI/CD & Infrastructure as Code

Development of the complete infrastructure via Terraform (IaC) including VPC, IAM & Co

badge icon

ETL Modernization

Migration and optimization of ETL processes (Ab Initio, DMS) and introduction of a new framework

badge icon

BI Realignment

Replacing MicroStrategy with Power BI including data model adjustment

badge icon

Data Security & Governance

Integrate Protegrity on AWS, align with ISO & governance teams

EOS: Migrating an On-Premise DWH

Results & Successes

check icon

Performance boost: Load and query processes run around 30% faster on Redshift than before.

check icon

Significant cost reduction: The cloud infrastructure reduces running costs by around 50%.

check icon

Cloud readiness: Development of a scalable, secure and agile data platform as a basis for international development.

30% Faster, 50% More Cost-Efficient: The Impact of the New DWH Solution

After a very intensive period of 8 months, all ETL processes (2,500 SQL scripts/views) were converted to Redshift, and day processing on Redshift was around 30% faster, with infrastructure costs (Exasol on AWS vs. Redshift) being around 50% lower. With this solution, EOS now has a DWH solution with the following features

  • Usage-based infrastructure costs with potentially "infinite" scaling
  • Significantly higher agility and shorter innovation cycles
  • A comprehensive data platform on AWS offering increased security against failure
  • Successful implementation of a number of "secondary items":
    • Development of a complete AWS infrastructure for DWH, data lake & ETL with code (Terraform)
    • Replacement of Oracle's GoldenGate with AWS DMS for data supply central operative system
    • Migration of Protegrity to AWS
    • Migration of AbInitio to AWS
    • Establishment of AWS infrastructure for machine learning
    • Creation of a new data model for Power BI
    • Migration reports

Outlook: Roadmap for the Further Development of the DWH Platform

A DWH is never really finished, which means that, after a migration with a rather minor development, the following points are now on the roadmap:

  • Implementation of new data use cases (new attributes, new data sources and even new preparations for scores ...)
  • Further optimization of data use cases and ETL processes to allow the existent Redshift cluster to be scaled down further so as to continue reducing infrastructure costs
  • Internationalization of the DWH for the EOS Group
  • Utilization of the cloud infrastructure for expansion into a modern data platform (increased use of data lakes, further expansion of ML workloads ...)
  • Validation of Redshift Serverless as a way to reduce operational effort and costs

The Tech Behind the Success

Amazon Web Services (AWS)

As an Advanced Partner of AWS, b.telligent supports its customers in the migration and setup of data platforms in the AWS cloud. More information here!

read more
Mann unterhält sich lächelnd am Tisch mit einer Frau

Download the Full Story

Want a handy PDF version of our success story? Whether you need it for yourself or to introduce the project to your team, download it now and explore the full success story. Enjoy reading!

Klaus-Dieter Schulze

Klaus-Dieter Schulze

Managing Director

Inspired?

Did our success stories spark your interest? If you're facing similar challenges in data, analytics and AI and look for expert support, let’s talk. A brief call can reveal how we can help you move forward.