The link between bad data and the failure of digital transformation
With a pizza, the base is just as important as the toppings on it. After all, a bad base ruins any pizza, no matter how good and tasty the toppings are. The same applies to data and applications. High-quality data is the starting point – and therefore the base – of every application. After all, one rule applies everywhere: ‘garbage in, garbage out’. The better you can qualify the source data, the higher the quality of the raw material for the application, and the better every output of that application will be. Process-oriented quality assurance starts with the data input. Data is the raw material, the building blocks and the piles under the foundation – the soil – of our information management.
Poor data quality is the main cause of failed digital transformations. Companies should therefore prioritize data transformation. After all, if the transformation is not set up to be data-driven, the new information management system will be built on quicksand – especially if we want to benefit from AI, big data and machine learning in that digital transformation. You can invest millions in data lakes, clouds, data scientists and Chief Data Officers, but if the source data is and remains of poor quality, that money is wasted.
Yet many transformations fail. And they do not simply fail, but fail spectacularly: uncontrolled, sprawling and hardly to be stopped. Numerous reports and studies show that more than 80 percent of big data projects fail. Much has been written lately about how corporate cultures and unchecked ambitions lead to failed big data projects. Here I focus on how poor data quality is overlooked, even though it is one of the main causes of failed digital transformations.
Data transformation, the process of turning raw data into a usable, good-quality format, is often incorrectly placed outside digital transformation projects. Companies assume that because they implement data lakes, new clouds, new data centers or new applications, they will automatically transform their data. That is a dangerous assumption. The new ERP your company implemented six months ago is not driving operational processes because data issues in the legacy system were never addressed. The new CRM your marketing team invested in to gain deep customer insights is not delivering the expected ROI because the team lacks a data governance or data quality framework.
Understanding the difference between digital and data transformation can help you avoid costly mistakes. To be data-driven, organizations must start by understanding their data, resolving inconsistencies, and transforming their data. Digital transformation is the end of the process – data transformation is the beginning!
What are the common stumbling blocks in digital transformation? Data is hidden in various sources, often in technically different systems with different data structures. The larger the company, the more likely it is that data is stored in many different databases, giving the organization a fragmented and inaccurate understanding of its data. Data classification can help here: it organizes the data logically and brings it back together.
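To make the idea of classification concrete, here is a minimal sketch of a rule-based classifier that tags records from different source systems with a logical category. The field names and categories are invented for illustration; real classification schemes are domain-specific.

```python
def classify_record(record: dict) -> str:
    """Assign a record from an arbitrary source to a logical category."""
    # Hypothetical rules: any field hinting at the record's domain decides.
    if "iban" in record or "account_number" in record:
        return "financial"
    if "email" in record or "phone" in record:
        return "customer_contact"
    if "sku" in record or "price" in record:
        return "product"
    return "unclassified"

# Records pulled from three technically different systems.
records = [
    {"email": "a@example.com", "phone": "+31 6 1234"},
    {"sku": "P-100", "price": 9.95},
    {"iban": "NL00BANK0123456789"},
]

# Bring the scattered data back together, grouped by category.
by_category: dict = {}
for rec in records:
    by_category.setdefault(classify_record(rec), []).append(rec)
```

In practice such rules would be replaced by a proper classification scheme, but the principle stands: once every record carries a category, data from many databases can be organized and consolidated logically.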
With people entering data manually, there is always a high chance of bad data. A human-dependent data collection process will always be a main cause of data quality issues. A typo, a misread name or location, a missed digit: minor slips like these erode the quality of the data over time. Unfortunately, we often encounter long-standing data environments that have never been cleansed of the errors they contain. People often do not even know the quality of the data they possess.
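Finding out what the quality of existing data actually is can start with a simple profiling pass. The sketch below counts typical manual-entry errors in an assumed customer table; the fields and validation rules (including the Dutch postcode pattern) are illustrative, not a standard.

```python
import re

def profile(rows: list) -> dict:
    """Count basic quality issues caused by manual data entry."""
    issues = {"missing_name": 0, "bad_postcode": 0, "bad_year": 0}
    for row in rows:
        if not row.get("name", "").strip():
            issues["missing_name"] += 1
        # Dutch postcode: four digits plus two letters (e.g. "1234 AB").
        if not re.fullmatch(r"\d{4}\s?[A-Z]{2}", row.get("postcode", "")):
            issues["bad_postcode"] += 1
        year = row.get("birth_year", 0)
        if not 1900 <= year <= 2025:
            issues["bad_year"] += 1
    return issues

rows = [
    {"name": "Jansen", "postcode": "1234 AB", "birth_year": 1980},
    {"name": "", "postcode": "12345", "birth_year": 19080},  # typical typos
]
print(profile(rows))  # {'missing_name': 1, 'bad_postcode': 1, 'bad_year': 1}
```

Even a profile this crude answers the question most organizations cannot: how bad is the data we already have?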
A company may collect the same consumer data for multiple purposes. Year after year, the same data is captured in a hundred different ways in many dispersed data environments. An insurance company struggled with its annual reporting because of duplicate data that accumulated over the months. A retailer had to postpone its expansion plans by six months because its data did not paint the right picture. After all, which data is true? Which price is correct if four different prices are found for the same product?
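The "four prices for one product" problem can be surfaced mechanically before anyone decides which value is true. A minimal sketch, assuming invented product records from dispersed sources:

```python
from collections import defaultdict

def find_conflicts(records: list) -> dict:
    """Group records by product id and report ids with disagreeing prices."""
    prices = defaultdict(set)
    for rec in records:
        prices[rec["product_id"]].add(rec["price"])
    # Only products where the sources disagree are conflicts.
    return {pid: sorted(vals) for pid, vals in prices.items() if len(vals) > 1}

records = [
    {"product_id": "P-100", "price": 9.95, "source": "erp"},
    {"product_id": "P-100", "price": 10.95, "source": "webshop"},
    {"product_id": "P-100", "price": 8.50, "source": "legacy"},
    {"product_id": "P-200", "price": 4.25, "source": "erp"},
]
print(find_conflicts(records))  # {'P-100': [8.5, 9.95, 10.95]}
```

Detecting the conflict is the easy part; deciding which source is authoritative is a governance question that no script can answer for you.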
Data that does not provide a unified source of truth: a bank struggled to create personalized experiences for its customers because each of its services (loans, mortgages, small business loans, insurance, etc.) had its own data sources. Customer information was replicated again and again as customers used different services of the bank. Without a consolidated view of its customers, the bank was unable to understand the customer journey and deliver personal experiences. Customer-centric working is only possible if the customer data is organized data-centrically. And nowadays customer data is even distributed across different clouds, and we hardly know which data sits in which cloud.
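A consolidated customer view means merging the per-service records into one profile. The sketch below uses a "latest value wins" rule; the field names and the `updated` timestamp are assumptions for illustration, and real consolidation logic is usually far more nuanced.

```python
def merge_customer(records: list) -> dict:
    """Merge per-service records into one profile; the newest value wins."""
    merged: dict = {}
    # Sort by an assumed ISO-format 'updated' timestamp so newer
    # records override older ones field by field.
    for rec in sorted(records, key=lambda r: r["updated"]):
        for key, value in rec.items():
            if key != "updated" and value is not None:
                merged[key] = value
    return merged

# The same customer as seen by two separate service silos.
loan = {"customer_id": 7, "address": "Old Street 1", "updated": "2022-01-10"}
mortgage = {"customer_id": 7, "address": "New Street 9",
            "email": "k@example.com", "updated": "2023-06-01"}
print(merge_customer([loan, mortgage]))
# {'customer_id': 7, 'address': 'New Street 9', 'email': 'k@example.com'}
```

The merge rule matters less than the principle: only after consolidation can the bank see one customer instead of five fragments.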
Data unprepared for business intelligence: data cleansing is a technical ETL (Extract, Transform, Load) process, but one with real-world impact. Data that has not been prepared – that is, not cleaned or optimized – cannot be used for business intelligence. A company that hopes to gain competitive advantage or key audience insights cannot do so with incomplete, inaccurate, outdated or duplicate data.
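The ETL idea can be sketched end to end in a few lines. The source rows and cleansing rules below are invented for illustration; a real pipeline would read from and write to actual systems.

```python
def extract() -> list:
    # Stand-in for reading rows from a source system.
    return [
        {"name": " Jansen ", "city": "amsterdam", "revenue": "1.200"},
        {"name": " Jansen ", "city": "amsterdam", "revenue": "1.200"},  # duplicate
        {"name": "de Vries", "city": "UTRECHT", "revenue": None},       # incomplete
    ]

def transform(rows: list) -> list:
    """Trim, normalize, convert types, and drop incomplete or duplicate rows."""
    seen, clean = set(), []
    for row in rows:
        if row["revenue"] is None:
            continue  # incomplete: unusable for BI
        rec = {
            "name": row["name"].strip(),
            "city": row["city"].title(),
            # European thousands separator: '1.200' -> 1200.0
            "revenue": float(row["revenue"].replace(".", "")),
        }
        key = (rec["name"], rec["city"])
        if key not in seen:  # deduplicate
            seen.add(key)
            clean.append(rec)
    return clean

def load(rows: list) -> None:
    # Stand-in for writing to the data warehouse.
    print(rows)

load(transform(extract()))
# [{'name': 'Jansen', 'city': 'Amsterdam', 'revenue': 1200.0}]
```

Only the transform step does the actual cleansing; that is exactly the step that gets skipped when companies assume a new data lake will prepare the data by itself.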
Just like with other raw materials, the quality and correct composition of data are crucial for a process and the resulting end product. Certainly if that process becomes automated, well-managed raw material quality is essential to guarantee process quality. In the capital goods industry, that is a no-brainer. In the not-yet-professionalized world of data management, unfortunately, it often is not. Applications and cloud services are sold as the panacea for every process challenge. Unfortunately, the sellers forget to mention that this only holds if the raw material is pure and correct, because application and cloud suppliers are not responsible for that. The customer is.