Approach
When data and the knowledge derived from it are generated automatically, as with big data methods, it is tempting to regard efforts to improve data quality as less important. This conclusion is wrong: data quality management remains necessary, even if its focus shifts.
Two typical real-life problems illustrate this. First, automated data generation may sporadically produce incomplete data, among other things because of system failures. These gaps must be handled in order to avoid systematic distortions. Second, analysis results are often misinterpreted when the processes that generated the data have not been analysed and conclusions are drawn on the basis of assumptions instead.
Data quality problems should be tackled systematically. In particular, analysing and eliminating the sources of identified problems is the basis for sustainable success. A study of data quality management in CRM (Leußer, 2011), based on a survey of experts in the DACH region, identified the following general causes:
- Design and operation of business processes and IT systems: this includes aspects such as insufficient processes and quality controls, and the ongoing operation of IT systems
- Implementation and use of data quality management: reasons can be a lack of data quality culture in the business or insufficient documentation
- Insufficient data collection: examples are limited influence on external data supplies, or data that is accidentally collected incorrectly or not collected at all
- Data becomes obsolete (out of date)
These causes can also be applied to big data methods:
The obsolescence of data is an important topic, especially for highly volatile data whose value is often short-lived. It is precisely for this reason that big data methods such as MapReduce were developed: to generate knowledge quickly. The analysis alone, however, is not enough for a quick response to new findings. An appropriately designed architecture and marketing automation systems for communication are required as well. The cause 'design of IT systems' mentioned above is therefore also affected.
A common criticism of big data analyses is that their results are misinterpreted. For example, correlations are often equated with causality, or relations such as Facebook friendships are overvalued. The main reasons are a lack of knowledge about how to handle analysis results and insufficient documentation.
As one can see, the focus of the causes underlying data quality problems shifts slightly in the context of big data analyses. This naturally also affects the measures to be taken. The study mentioned above specifies the following general categories of measures for CRM-relevant data:
- Category People: such as sensitizing and motivating employees and training them in system operation
- Category Organization: such as defining responsibilities for data stocks and their quality
- Category Processes: such as considering data quality requirements in system development and running plausibility checks during data collection
- Category Technology: such as the rule-based completion of missing data
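To make the Processes and Technology categories more concrete, the following is a minimal sketch of a plausibility check at collection time and a rule-based completion of a missing field. The record layout, the field names (`email`, `birth_year`, `phone`, `country`) and the rule itself (deriving the country from the phone prefix) are assumptions chosen for illustration and do not come from the study.

```python
from datetime import date

# Hypothetical completion rule: international dialling prefix -> country code.
PHONE_PREFIX_TO_COUNTRY = {"+49": "DE", "+43": "AT", "+41": "CH"}


def plausibility_errors(record: dict) -> list[str]:
    """Return the names of fields that fail a simple plausibility check."""
    errors = []
    email = record.get("email") or ""
    # Very rough check: exactly one "@" that is neither first nor last character.
    if email.count("@") != 1 or email.startswith("@") or email.endswith("@"):
        errors.append("email")
    birth_year = record.get("birth_year")
    if birth_year is not None and not (1900 <= birth_year <= date.today().year):
        errors.append("birth_year")
    return errors


def fill_country(record: dict) -> dict:
    """Return a copy of the record with a missing country filled in by rule."""
    filled = dict(record)
    if not filled.get("country"):
        phone = filled.get("phone") or ""
        for prefix, country in PHONE_PREFIX_TO_COUNTRY.items():
            if phone.startswith(prefix):
                filled["country"] = country
                break
    return filled


record = {"email": "a@example.com", "birth_year": 1985,
          "phone": "+43 1 234567", "country": None}
print(plausibility_errors(record))      # fields that look implausible
print(fill_country(record)["country"])  # country derived from the phone prefix
```

A rule of this kind only supplements data. It should be documented, because a value derived by rule is an assumption and not an observation.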
These categories can also be applied to big data analyses. In some cases, the results of these analyses can support traditional data collection and thereby close information gaps, which are themselves a data quality problem. For example, behaviour on websites has so far not been quickly accessible information because of the sheer mass of tracking data. Appropriate analysis can now turn it into valuable content for interaction and communication.
A further possible use of big data methods is data quality monitoring. Here the focus is on detecting irregularities in the data streams. Such monitoring safeguards the data basis both for traditional data and methods and for big data analyses. When irregularities such as outliers or missing data occur, notifications are triggered so that the causes can be analysed and measures initiated.
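As an illustration of such monitoring, the sketch below flags missing values and outliers in a batch of measurements. It is only one possible approach: the outlier rule (a modified z-score based on the median absolute deviation, with a threshold of 3.5) and the function name `find_irregularities` are assumptions made for this example, and a real notification would replace the `print` call.

```python
from statistics import median
from typing import Optional, Sequence


def find_irregularities(
    values: Sequence[Optional[float]], threshold: float = 3.5
) -> list[tuple[int, str]]:
    """Return (index, reason) pairs for missing values and outliers."""
    present = [v for v in values if v is not None]
    if not present:
        return [(i, "missing") for i in range(len(values))]

    med = median(present)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median([abs(v - med) for v in present])

    alerts = []
    for i, v in enumerate(values):
        if v is None:
            alerts.append((i, "missing"))
        elif mad > 0 and 0.6745 * abs(v - med) / mad > threshold:
            alerts.append((i, "outlier"))
    return alerts


# Hypothetical hourly page-view counts with one gap and one spike.
page_views = [100, 102, None, 98, 101, 950, 99]
for index, reason in find_irregularities(page_views):
    print(f"notify: {reason} value at position {index}")
```

The median-based rule is used here because a single extreme value would distort a mean and standard deviation computed from the same batch. Which rule and threshold are appropriate depends on the data stream being monitored.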
Summary
Big data methods can generate valuable knowledge for CRM. Particularly in fields such as social networking or search behaviour, which previous data does not cover, this enriches the picture of the customer. However, transforming this knowledge into value continually and sustainably requires data quality management. Ongoing data quality monitoring, documentation of the data generation processes and the spreading of method knowledge are important measures that every business should implement.