Reimagining loadbee's Data Architecture

B2B Premium Insights for SaaS Provider

More Insight, More Impact: Loadbee Modernizes With AWS

Growing demands for data analysis and reporting required loadbee to fundamentally modernize its data architecture. Working with b.telligent, loadbee created a scalable AWS Lakehouse platform that enables detailed product-level analytics, a longer data history, and significantly higher performance, all while reducing costs and opening up new revenue opportunities.

Quick Facts About the Project

Region & Sector: Germany, Information Technology

Company size: Medium-sized company

Project duration: 3-6 Months

Project type: Modernization

Technology: AWS

About the client

75% less effort required to create new dashboards

Initial Situation & Challenge

loadbee is a SaaS provider that integrates A+ and Premium Content from brand manufacturers directly into the product detail pages of online retailers. The brand customers whose content loadbee delivers receive data about these syndications. Over time, both the volume of data and the requirements grew, and the original Elasticsearch-based tech stack could no longer cope, for several reasons:

For example, every change in information requirements was labor-intensive and required manual adjustments from tracking through the ETL processes to changing and deploying source code of the loadbee customer portal. Additionally, data granularity was limited to the brand, shop, and daily levels for a maximum of three months of data history. However, customers demanded more detailed analysis options down to the product level and a longer data history. The existing infrastructure not only incurred high maintenance costs but also restricted loadbee’s ability to integrate additional datasets and use cases, such as event analysis.

To overcome these challenges, loadbee was looking for a new data architecture and the right processes to build a flexible, high-performance data product for its customers.

Solution

loadbee asked b.telligent to support them in rethinking their analytics platform on AWS. After initial workshops to understand the current situation and gather requirements, b.telligent proposed a new data architecture along with a domain-level data model for the syndication data to tackle the existing problems.

The result was a modern AWS Lakehouse data platform, an architecture that combines a Data Lake and a Data Warehouse in one integrated solution. The Data Lake serves as the initial ingestion sink and long-term storage for syndication tracking data and for data from additional sources: master data about brands, online shops, and products, as well as event data on syndication content. In the Data Warehouse, syndication and master data are integrated into a core data schema, which is the source for custom aggregates in the presentation layer. The presentation layer in turn serves as the single data source for the new dashboards.

Figure: Structure of the loadbee data platform (b.telligent cloud data architecture with ETL processes, data warehouse, dbt, and AWS backup)

Data processing and metadata management were implemented with a combination of AWS Glue and dbt. Glue data pipelines ingest the data from the source systems and write it to the Data Lake, where it is stored in a raw data layer in its original format (JSON or CSV). The pipelines then process the raw data into the standardized data layer, converting all data to a common format and saving it as Parquet files. Along the way, metadata about the data is collected in the Glue Data Catalog.
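
The raw-to-standardized step can be sketched in plain Python. This is a minimal illustration under stated assumptions, not loadbee's actual Glue job: the field names (`brand_id`, `shop_id`, `product_id`, `ts`) and the `standardize_*` helpers are hypothetical, and a real Glue job would typically use Spark and write Parquet files instead of returning Python dictionaries.

```python
import csv
import io
import json
from datetime import datetime, timezone


def _to_standard(record: dict) -> dict:
    """Map one raw record onto a common schema with consistent types."""
    # Assumption: every raw source carries an epoch-seconds timestamp in "ts".
    event_time = datetime.fromtimestamp(int(record["ts"]), tz=timezone.utc)
    return {
        "brand_id": str(record["brand_id"]),
        "shop_id": str(record["shop_id"]),
        "product_id": str(record["product_id"]),
        "event_time": event_time.isoformat(),
    }


def standardize_json_lines(raw: str) -> list[dict]:
    """Parse newline-delimited JSON from the raw layer."""
    return [_to_standard(json.loads(line)) for line in raw.splitlines() if line.strip()]


def standardize_csv(raw: str) -> list[dict]:
    """Parse CSV with a header row from the raw layer."""
    return [_to_standard(row) for row in csv.DictReader(io.StringIO(raw))]


# Made-up sample payloads in the two raw formats named in the text.
json_raw = '{"brand_id": 7, "shop_id": "s1", "product_id": "p9", "ts": 0}\n'
csv_raw = "brand_id,shop_id,product_id,ts\n7,s2,p9,60\n"

rows = standardize_json_lines(json_raw) + standardize_csv(csv_raw)
for row in rows:
    print(row)
```

Whatever the source format, every record leaves this step with the same keys and types, which is what lets the downstream layers treat JSON and CSV sources alike.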

Based on the Glue Data Catalog, Redshift Spectrum defines a virtual staging layer in the Data Warehouse, which runs on Redshift Serverless. For data processing and transformation in the Data Warehouse, dbt (data build tool), an open-source data transformation tool, was selected. The dbt transformations integrate data from all sources in a core data layer and build the downstream tables in the presentation layer, which feed the dashboards implemented in Yellowfin BI.
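
A virtual staging layer of this kind is usually declared in Redshift as an external schema that points at a Glue Data Catalog database. The sketch below only builds that statement as a string. The schema name, Glue database name, and IAM role ARN are placeholders, and the helper `external_schema_ddl` is hypothetical rather than taken from the project.

```python
def external_schema_ddl(schema: str, glue_database: str, iam_role_arn: str) -> str:
    """Build a Redshift statement exposing a Glue database as an external schema."""
    # Keep the sketch safe: only accept plain identifiers for the schema name.
    if not schema.isidentifier():
        raise ValueError(f"not a valid schema name: {schema!r}")
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{glue_database}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


# Placeholder values, not the project's real names.
ddl = external_schema_ddl(
    schema="staging",
    glue_database="standardized",
    iam_role_arn="arn:aws:iam::123456789012:role/example-spectrum-role",
)
print(ddl)
```

Once such a schema exists, tables registered in the catalog can be queried from the warehouse without first copying the Parquet files into it, which is why the staging layer is called virtual.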

During the project, DataOps best practices were introduced, and both the infrastructure and the data pipelines were implemented accordingly. The AWS infrastructure is defined as code in Terraform, and all deployments, including those of the data pipelines, run exclusively through GitHub CI/CD pipelines.

As part of data processing, data quality tests were established and run in the automated data pipelines. Infrastructure and data processing in the Data Lake are monitored in an Amazon CloudWatch dashboard, and when a data pipeline fails, an email notification is sent via Amazon SNS.
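
The pattern of running data quality tests inside a pipeline and notifying on failure can be sketched as follows. The checks, the column names (`event_id`, `product_id`), and the `notify` callback are illustrative assumptions. In an AWS setup the callback could wrap the SNS `publish` call, but this sketch uses a local stand-in so that it runs anywhere.

```python
from typing import Callable, Iterable


def check_not_null(rows: Iterable[dict], column: str) -> list[str]:
    """Return one message per row whose column is missing or None."""
    return [
        f"row {i}: {column} is null"
        for i, row in enumerate(rows)
        if row.get(column) is None
    ]


def check_unique(rows: Iterable[dict], column: str) -> list[str]:
    """Return one message per row that repeats an earlier value of column."""
    seen: set = set()
    failures = []
    for i, row in enumerate(rows):
        value = row.get(column)
        if value in seen:
            failures.append(f"row {i}: duplicate {column}={value!r}")
        seen.add(value)
    return failures


def run_checks(rows: list[dict], notify: Callable[[str, str], None]) -> bool:
    """Run all checks; call notify(subject, message) once if anything fails."""
    failures = check_not_null(rows, "product_id") + check_unique(rows, "event_id")
    if failures:
        notify("Data quality check failed", "\n".join(failures))
    return not failures


# Made-up sample batch with one null and one duplicate.
batch = [
    {"event_id": 1, "product_id": "p1"},
    {"event_id": 2, "product_id": None},
    {"event_id": 2, "product_id": "p3"},
]

sent: list[tuple[str, str]] = []  # stand-in for an email notification channel
ok = run_checks(batch, lambda subject, message: sent.append((subject, message)))
print(ok, sent)
```

Passing the notifier in as a function keeps the checks testable without any cloud credentials.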

AWS Service | Role in Data Architecture | Advantages
AWS Glue (ETL) | Connector to source systems and data pipeline | Serverless workflows based on Python and Spark
AWS Glue (Data Catalog) | Crawling raw data, offering a data catalog | Integral part of the data pipelines and integration of Data Lake and Data Warehouse
Amazon S3 | Data Lake storage | Cost-effective and scalable data storage
Amazon Redshift Serverless | Cloud Data Warehouse | Serverless, elastic compute engine and scalable storage
Amazon CloudWatch | Cloud monitoring | Integrated monitoring of AWS services and infrastructure
Amazon SNS | Notification service for data pipelines and CloudWatch | Serverless notification service that sends event-based emails

With the new multi-layered data model, it became possible to decouple the enrichment of the tracking data from the actual tracking implementation. Data from the master data source systems (brands, products, or online shops) is now integrated with syndication and event data in the core layer of the Data Warehouse. This enables new ways to analyze the data, because any new field or piece of information from the master data systems can be used to analyze the syndications and events. To give the dashboards the desired responsiveness, use-case-specific aggregate tables are provided in the presentation layer.
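
The layering described here, tracking data joined with master data in a core layer and then pre-aggregated for a dashboard, can be illustrated with an in-memory SQLite database. All table names, column names, and rows below are made up for the example. The project itself uses dbt models on Redshift, not SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Core layer: master data from a source system (hypothetical columns).
    CREATE TABLE core_product (product_id TEXT PRIMARY KEY, brand TEXT, category TEXT);
    -- Core layer: tracking data carries only keys, no descriptive attributes.
    CREATE TABLE core_syndication (product_id TEXT, shop TEXT, day TEXT, views INTEGER);

    INSERT INTO core_product VALUES
        ('p1', 'BrandA', 'Audio'),
        ('p2', 'BrandA', 'Video');
    INSERT INTO core_syndication VALUES
        ('p1', 'shop1', '2024-01-01', 10),
        ('p1', 'shop2', '2024-01-01', 5),
        ('p2', 'shop1', '2024-01-01', 7);

    -- Presentation layer: a use-case-specific aggregate for one dashboard.
    CREATE TABLE pres_views_by_category AS
    SELECT p.brand, p.category, s.day, SUM(s.views) AS views
    FROM core_syndication AS s
    JOIN core_product AS p USING (product_id)
    GROUP BY p.brand, p.category, s.day;
    """
)

result = conn.execute(
    "SELECT brand, category, day, views FROM pres_views_by_category ORDER BY category"
).fetchall()
print(result)
```

Because `category` lives only in the master data table, adding another attribute there makes it available for analysis without touching the tracking implementation.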

Voices From the Project

Revamping our AWS data architecture with b.telligent has transformed our analytics capabilities, enabling us to provide our brand customers with deeper, product-level insights and extended data histories. This upgrade not only reduced the time to create new dashboards but also opens new revenue opportunities and significantly enhances the value we offer.

Christian Renner

Chief Technology Officer at loadbee

Like many B2B portal providers, loadbee faced the challenge of meeting customers' demands for a flexible and high-performance way to analyze data about their services. As a long-time AWS customer, loadbee partnered with us to develop a new data platform, taking full advantage of AWS's native data analytics services. This project leveraged the expertise of our specialized AWS Data & Analytics team at b.telligent.

Ingo Klose

Management Consultant Data Platform & Data Management at b.telligent

b.telligent Services at a Glance

Conceptualizing an AWS Lakehouse architecture

Development of a future-proof platform that combines the flexibility of data lakes with the structure of a data warehouse.

Implementing ETL pipelines with AWS Glue

Automated extraction, transformation and loading of data via serverless ETL processes - efficient and scalable.

Performing data transformation with dbt

Building a robust data model with dbt for structured integration and preparation of data in the data warehouse.

Setting up CI/CD pipelines and DataOps practices

End-to-end automation of deployments and tests for stable, maintainable data processes according to DataOps principles.

Integrating monitoring and alerting systems

Real-time monitoring with Amazon CloudWatch and automated alerts via Amazon SNS ensure smooth operation.

Results & Successes

Scalable Lakehouse architecture: The new AWS data platform combines flexibility and performance, with an extended data history, an optimized data structure, and a future-proof design.

Dashboards in record time: Thanks to standardized data models and CI/CD pipelines, the effort required to create new dashboards has been reduced by over 75%.

New sales potential through data products: loadbee can now market valid analysis data directly - a real added value for brand customers and a new business model.

The new data architecture resulted in several benefits, which are already showing promising results for loadbee and its brand customer base.

loadbee is now able to offer its customers a data product called Premium Insights. It contains new dashboards, where brand customers can analyze data not only at the brand, shop and day levels but also at the product level. Together with the extended data history, one year instead of 90 days, this is a significant amount of detailed syndication data. In addition, dashboards showing view duration and click events help customers optimize their content strategy.

The effort required to create new dashboards based on existing datasets has been reduced by at least 75%. In the best case, it takes less than a day to build a dashboard and have it ready in production. If new information is available about shops, brands, or products, it can be integrated quickly by the data engineers and made available for analytics, without involving loadbee’s portal development team.

The data architecture allows new kinds of data to be integrated into the platform. This has already been done with view duration data on loadbee’s premium content, which enables more advanced analytics on end-user behavior and helps brands identify which content is popular. Adding this level of detail and these analytic capabilities was only possible because of the scalable data infrastructure and the integration with syndication and master data.

As a side effect, data quality issues discovered during the project could be fixed in the source systems, which improved data quality not only in the data platform but also in the operational platform.

The ROI for this project is not purely based on cost, although the new infrastructure costs 50% less than the old stack while providing 1,000 times more data about the syndications alone. It also opened up new revenue streams for loadbee, as they sell data as a product to their brand customer base.
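
The two figures in the paragraph above can be combined into a rough cost-per-data-volume comparison. This is back-of-the-envelope arithmetic only. It assumes that the 50% cost figure and the 1,000-fold data figure refer to the same before and after comparison, and it normalizes the old stack to a cost and volume of 1.

```python
from fractions import Fraction

# Normalize the old stack to cost 1 and data volume 1.
old_cost = Fraction(1)
old_volume = Fraction(1)

new_cost = old_cost * Fraction(1, 2)  # "costs 50% less"
new_volume = old_volume * 1000        # "1,000 times more data"

old_unit_cost = old_cost / old_volume
new_unit_cost = new_cost / new_volume

# How many times cheaper a unit of data is on the new platform.
improvement = old_unit_cost / new_unit_cost
print(improvement)
```

Under these assumptions, the cost per unit of syndication data falls by a factor of 2,000.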

The Tech Behind the Success

Amazon Web Services (AWS)

As an Advanced Partner of AWS, b.telligent supports its customers in the migration and setup of data platforms in the AWS cloud.

Klaus-Dieter Schulze

Managing Director

Inspired?

Did our success stories spark your interest? If you're facing similar challenges in data, analytics, and AI and are looking for expert support, let’s talk. A brief call can reveal how we can help you move forward.