The Idea of Data Mesh
The data mesh is based on the idea that a company's data is no longer managed by a central data department, but by the decentralized teams that produce and use the data. For this to work, four principles have been established:
- Domain ownership – decentralized and distributed responsibility:
Each decentralized departmental team has the right and the responsibility to define, collect, store, maintain and publish its own data. Each team should therefore be able to manage its data products independently and decide how other teams within the enterprise can use them.
- Approach to data as a product:
According to Marty Cagan, products must be valuable, usable and feasible. The same applies to data products, which should also be developed, tested, documented, deployed and maintained like software products.
- Self-service data infrastructure as a platform:
In order to support decentralized teams in their work with data content, a central team provides a platform for handling technical aspects such as data discovery, data integration, API management, data-quality control and monitoring.
- Distributed, automated governance:
Because responsibility for data products is fundamentally decentralized, responsibility for governance must also be assumed decentrally. Various aspects are automated to avoid relying solely on organizational instructions. Security and data protection are the foremost candidates here.
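To make the last principle more tangible, here is a minimal Python sketch of what an automated governance check could look like. It is an illustration under assumptions, not a reference implementation: the `DataProduct` descriptor, its fields and the `check_governance` function are hypothetical names invented for this example, and a real platform would define its own policies and metadata format.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team publishes with its data product."""
    name: str
    owner_team: str        # domain ownership: who is responsible
    description: str       # data as a product: documented for consumers
    columns: dict          # column name -> type name
    pii_columns: set = field(default_factory=set)     # columns holding personal data
    masked_columns: set = field(default_factory=set)  # columns masked before publishing


def check_governance(product: DataProduct) -> list:
    """Return a list of policy violations; an empty list means the product passes.

    The three rules below are assumed example policies, not an established standard.
    """
    violations = []
    if not product.owner_team:
        violations.append("missing owner team")
    if not product.description:
        violations.append("missing description")
    # Data protection rule: every personal-data column must be masked.
    for column in sorted(product.pii_columns - product.masked_columns):
        violations.append(f"PII column '{column}' is not masked")
    return violations


orders = DataProduct(
    name="orders",
    owner_team="sales",
    description="Daily orders per customer",
    columns={"id": "int", "email": "string"},
    pii_columns={"email"},
)
for violation in check_governance(orders):
    print(violation)
```

A check of this kind could run on the central platform whenever a team publishes a data product, so that security and data protection rules are enforced automatically while the responsibility for the product itself stays with the team.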
The data mesh addresses precisely this proximity to business through decentralization and application-specific data products. In this context, there are not only citizen data analysts who carry out decentralized evaluations, but also IT-savvy business analysts who take on decentralized data engineering. Previously, a single source of truth was pursued as a kind of holy grail by means of a central, integrated data model and central governance. The data mesh dispenses with both. Does this mean we will lose the single source of truth? Are we aware of this? Are we ready for this? In my estimation, we are now striving for a higher degree of truth: a business-oriented truth that can be used directly in business processes.
Current Relevance of Data Mesh
I currently see the data mesh as being particularly relevant for large organizations. For smaller companies, a central model is still very well suited at the moment:
- Critical skills are bundled in a single team.
- In a small organization, a central team can guarantee proximity to business.
- A small, central team does not have the coordination problems/overhead of large central teams.
As technical support improves, the data mesh will also become more relevant for smaller organizations. It's just a matter of time.
In March, I moderated a Google event on data meshes. Together with two interesting and experienced speakers, Dr. Anna Hannemann and Peter Kühni, 30 clients discussed their expectations and the challenges they face in their daily use of data meshes. Clearly, many expect data meshes to help them manage growth better and gain further relevance. One of the challenges is the complexity of the data mesh, which demands consistency across different teams as well as more communication and technical support. The first tools for technical support are already available on the market, but they are not yet established and are still complicated to use.