


Reimagining loadbee's Data Architecture
B2B Premium Insights for SaaS Provider
Initial Situation & Challenge
loadbee is a SaaS provider that integrates A+ and Premium Content from brand manufacturers directly into the product detail pages of online retailers. The brand customers whose content loadbee delivers receive data about these syndications. Over time, both the data volume and the requirements grew, and the original tech stack based on Elasticsearch could no longer keep up, for several reasons:
For example, every change in information requirements was labor-intensive, requiring manual adjustments all the way from tracking through the ETL processes to changing and deploying the source code of the loadbee customer portal. In addition, data granularity was limited to the brand, shop, and day levels, with a maximum of three months of data history, while customers demanded more detailed analysis options down to the product level and a longer history. The existing infrastructure not only incurred high maintenance costs but also restricted loadbee's ability to integrate additional datasets and use cases, such as event analysis.
To overcome these challenges, loadbee was looking for a new data architecture and the right processes to build a flexible, high-performance data product for its customers.
Solution
loadbee asked b.telligent for support in rethinking its analytics platform on AWS. After initial workshops to understand the current situation and gather requirements, b.telligent proposed a new data architecture, along with a domain-level data model for the syndication data, to tackle the existing problems.
The result was a modern AWS Lakehouse data platform: an architecture that combines a Data Lake with a Data Warehouse in a seamlessly integrated solution. The Data Lake serves as the initial ingestion sink and long-term storage for syndication tracking data as well as data from additional sources, including master data about brands, online shops, and products, and event data on syndication content. In the Data Warehouse, syndication and master data are integrated into a core data schema, which serves as the source for custom aggregates in the presentation layer. The presentation layer, in turn, is the single data source for the new dashboards.

Data processing and metadata management were implemented using a combination of AWS Glue and dbt. Glue data pipelines ingest data from the source systems and write it into the Data Lake, where it is stored in a raw data layer in its original format (JSON or CSV). The pipelines then process the raw data into a standardized data layer, converting everything to a common format and saving it as Parquet files. Along the way, metadata about the data is collected in the Glue Data Catalog.
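A minimal sketch of what such a Glue job can look like in PySpark: it reads raw JSON from the raw layer and writes standardized Parquet while registering the result in the Glue Data Catalog. The bucket names, database and table names, and the event_date partition key are illustrative assumptions, not the actual project configuration.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve arguments, initialize contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw JSON tracking data from the raw layer of the Data Lake
# (bucket and prefix are assumed names).
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://loadbee-datalake-raw/syndication/"]},
    format="json",
)

# Write the data as Parquet into the standardized layer and update the
# table and its partitions in the Glue Data Catalog.
sink = glue_context.getSink(
    connection_type="s3",
    path="s3://loadbee-datalake-standardized/syndication/",
    enableUpdateCatalog=True,
    updateBehavior="UPDATE_IN_DATABASE",
    partitionKeys=["event_date"],  # assumed partition column
)
sink.setCatalogInfo(catalogDatabase="standardized", catalogTableName="syndication")
sink.setFormat("glueparquet")
sink.writeFrame(raw)

job.commit()
```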
Based on the Glue Data Catalog, Redshift Spectrum defines a virtual staging layer in the Redshift Serverless-based Data Warehouse. For data processing and transformation within the Data Warehouse, the open-source tool dbt (data build tool) was selected. With dbt, transformations were developed that integrate data from all sources in a core data layer, along with downstream tables in the presentation layer that serve the dashboards implemented in Yellowfin BI.
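One way to wire up such a virtual staging layer is to expose the Glue Data Catalog database of the standardized layer as an external (Spectrum) schema in Redshift. The sketch below uses the boto3 Redshift Data API against a Redshift Serverless workgroup; the workgroup name, database names, and IAM role ARN are made-up placeholders.

```python
import boto3

client = boto3.client("redshift-data", region_name="eu-central-1")

# Expose the Glue Data Catalog database 'standardized' as an external
# schema, i.e. the virtual staging layer of the Data Warehouse.
# All identifiers below are assumptions for illustration.
sql = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS staging
FROM DATA CATALOG
DATABASE 'standardized'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-spectrum-role'
"""

response = client.execute_statement(
    WorkgroupName="loadbee-dwh",  # assumed Redshift Serverless workgroup
    Database="analytics",
    Sql=sql,
)
print(response["Id"])  # statement id; progress can be polled via describe_statement
```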
During the implementation of the project, DataOps best practices were introduced, and both the infrastructure and the data pipelines were built accordingly. The AWS infrastructure is defined as code in Terraform, and all deployments, including those of the data pipelines, are done exclusively via GitHub CI/CD pipelines.
As part of the data processing, data quality tests were established and executed within the automated data pipelines. Infrastructure and data processing in the Data Lake are monitored via AWS CloudWatch dashboards, and in the case of data pipeline failures, notifications are sent out by email via AWS SNS.
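A hedged sketch of how such checks and alerts can be combined in a pipeline step: two simple quality checks on a Spark DataFrame, with any failures published to an SNS topic whose email subscribers receive the alert. The topic ARN and the syndication_id key column are assumptions for illustration.

```python
import boto3

# Assumed topic ARN; email subscriptions to this topic deliver the alerts.
SNS_TOPIC_ARN = "arn:aws:sns:eu-central-1:123456789012:data-pipeline-alerts"


def run_quality_checks(df, table_name):
    """Run two simple checks on a Spark DataFrame: non-empty, no NULL keys.

    The key column syndication_id is an assumption for illustration.
    """
    failures = []
    if df.count() == 0:
        failures.append(f"{table_name}: table is empty")
    null_keys = df.filter(df["syndication_id"].isNull()).count()
    if null_keys > 0:
        failures.append(f"{table_name}: {null_keys} rows with NULL syndication_id")
    return failures


def alert_on_failures(failures):
    """Publish failed checks to SNS and fail the pipeline step."""
    if not failures:
        return
    boto3.client("sns").publish(
        TopicArn=SNS_TOPIC_ARN,
        Subject="Data quality check failed",
        Message="\n".join(failures),
    )
    raise RuntimeError("Data quality checks failed: " + "; ".join(failures))


# Typical use inside a Glue/Spark pipeline step:
# alert_on_failures(run_quality_checks(standardized_df, "syndication"))
```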
With the new multi-layered data model, the enrichment of tracking data could be decoupled from the actual tracking implementation. Data from the master data source systems (brands, products, or online shops) is now integrated with syndication and event data in the core layer of the Data Warehouse. This opens up new ways to analyze the data: any new field or piece of information from the master data systems can be used to analyze syndications and events. To achieve the desired dashboard responsiveness, use-case-specific aggregate tables were created in the presentation layer.
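As an illustration of such a presentation-layer aggregate, the snippet below creates a daily, product-level syndication count via the Redshift Data API. In the actual project this transformation logic is maintained as dbt models; the schema, table, and column names here are assumptions.

```python
import boto3

client = boto3.client("redshift-data", region_name="eu-central-1")

# Illustrative CTAS for a use-case-specific aggregate: daily syndication
# counts per brand, shop, and product. All identifiers are assumed names.
aggregate_sql = """
CREATE TABLE presentation.syndication_daily_product AS
SELECT brand_id,
       shop_id,
       product_id,
       event_date,
       COUNT(*) AS syndications
FROM core.syndication
GROUP BY brand_id, shop_id, product_id, event_date
"""

client.execute_statement(
    WorkgroupName="loadbee-dwh",  # assumed Redshift Serverless workgroup
    Database="analytics",
    Sql=aggregate_sql,
)
```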
b.telligent Services at a Glance
Conceptualizing an AWS Lakehouse architecture
Development of a future-proof platform that combines the flexibility of data lakes with the structure of a data warehouse.
Implementing ETL pipelines with AWS Glue
Automated extraction, transformation and loading of data via serverless ETL processes - efficient and scalable.
Performing data transformation with dbt
Building a robust data model with dbt for structured integration and preparation of data in the data warehouse.
Setting up CI/CD pipelines and DataOps practices
End-to-end automation of deployments and tests for stable, maintainable data processes according to DataOps principles.
Integrating monitoring and alerting systems
Real-time monitoring with AWS CloudWatch and automated alerts via SNS ensure smooth operation.

Results & Successes
Scalable Lakehouse architecture: The new AWS data platform combines flexibility and performance - with an extended data history, an optimized data structure, and future-proofing.
Dashboards in record time: Thanks to standardized data models and CI/CD pipelines, the effort required to create new dashboards has been reduced by over 75%.
New sales potential through data products: loadbee can now market reliable analytics data directly - real added value for brand customers and a new business model.
The new data architecture has brought several benefits, which are already showing promising results for loadbee and its brand customers.
loadbee is now able to offer its customers a data product called Premium Insights. It contains new dashboards in which brand customers can analyze data not only at the brand, shop, and day levels but also at the product level. Together with the extended data history (one year instead of 90 days), this adds up to a significant amount of detailed syndication data. In addition, dashboards showing view duration and click events help customers optimize their content strategy.
The effort required to create new dashboards based on existing datasets has been reduced by at least 75%. In the best case, it takes less than a day to build a dashboard and have it ready in production. If new information is available about shops, brands, or products, it can be integrated quickly by the data engineers and made available for analytics, without involving loadbee’s portal development team.
The data architecture allows new kinds of data to be integrated into the platform, as has been done with view duration data on loadbee's premium content, which enables more advanced analytics on end-user behavior and helps brands identify which content resonates. Adding this new level of detailed data and analytic capability was only possible thanks to the scalable data infrastructure and the integration of syndication and master data.
As a side effect, data quality issues discovered during the project could be fixed in the source systems, improving data quality beyond the data platform and into the operational platform.
The ROI for this project is not purely based on cost, although the new infrastructure costs 50% less than the old stack while providing 1,000 times more data about the syndications alone. It also opened up new revenue streams, as loadbee now sells data as a product to its brand customer base.
The Tech Behind the Success

Amazon Web Services (AWS)
As an Advanced Partner of AWS, b.telligent supports its customers in the migration and setup of data platforms in the AWS cloud.

Download the Full Story
Want a handy PDF version of this success story? Whether you need it for yourself or to introduce the project to your team, download it now. Enjoy reading!
Inspired?
Did our success stories spark your interest? If you're facing similar challenges in data, analytics, and AI and are looking for expert support, let's talk. A brief call can reveal how we can help you move forward.
