Data quality: the foundation for effective data-driven work

Answering key questions about data quality

Kasper Soekarjo
Solution Architect
5 min
01 Aug 2023

Data projects often need to deliver results quickly. The field is relatively new, and to gain support, it must first prove its value. As a result, many organisations build data solutions without giving much thought to their robustness, often overlooking data quality. What are the risks if your data quality is not in order, and how can you improve it? Find the answers to the key questions about data quality in this article.

How important is data quality?

To determine the importance of having your data quality in order, you must view your data in the context of your business processes:

  • Why do we need this data?
  • What do we use it for?
  • Which processes rely on it?
  • What value does it provide?

In short: what is this data worth to your organisation? From this starting point, you can consider the potential consequences of errors in your data, such as missing or inaccurate values. Seen in that context, the question is no longer 'Should I think about data quality?' but 'Can I afford not to?'

A recent study has shown that the average number of data quality incidents, the time required to detect them, and the time to resolve them have all increased compared to the previous year. At the same time, the estimated impact of data quality issues on both business stakeholders and revenue has increased due to the growth of data-driven working.

What are the risks and consequences of poor data quality?

Assessing the risks of poor data quality is complex and requires good communication between the business and technical sides of your organisation. The business knows the value of certain data and how processes run, while technical experts often have better insight into the quality of the data itself. Together, they can determine which data is critical and where data quality must be impeccable.

By having these discussions, you can identify areas for improvement and also discover where you can save time and money by not investing in higher data quality. For data used in critical decisions or processes, you must be certain that it is accurate. If a dataset is primarily used for trend analysis, minor data quality issues may be less of a concern. For example, compare the critical data used by governments, pension funds, and banks with retail sales or customer service data.

Assess the risks and consequences by asking a few questions:

  • What could go wrong?
  • What are the consequences if things go wrong?
  • How likely is it to go wrong?

Always adapt your data quality strategy to the type of data and the risks involved. The investment needed to get data quality in order and keep it consistent should be proportionate to those risks. Also make sure to document your decisions regarding data quality.
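
The three questions above can be turned into a rough ranking of where to invest first. The sketch below is one possible way to do that in Python, assuming a simple likelihood times impact score on 1 to 5 scales. The `DatasetRisk` class, the dataset names, and the scores are all hypothetical and only serve as an illustration; your own weighting may well differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRisk:
    """Hypothetical risk record for one dataset (the 1-5 scales are an assumption)."""
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain): how likely is it to go wrong?
    impact: int      # 1 (negligible) to 5 (severe): what are the consequences?

    @property
    def score(self) -> int:
        # Simple risk-matrix product; other weightings are equally defensible.
        return self.likelihood * self.impact


def rank_by_risk(datasets: list[DatasetRisk]) -> list[DatasetRisk]:
    """Return the datasets ordered from highest to lowest risk score."""
    return sorted(datasets, key=lambda d: d.score, reverse=True)


# Illustrative values only, agreed between business and technical colleagues.
datasets = [
    DatasetRisk("customer_service_tickets", likelihood=3, impact=1),
    DatasetRisk("pension_payments", likelihood=2, impact=5),
    DatasetRisk("retail_sales_trends", likelihood=4, impact=2),
]

for dataset in rank_by_risk(datasets):
    print(f"{dataset.name}: {dataset.score}")
```

Keeping the scores next to the documented decisions makes it easier to explain later why one dataset received more attention than another.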

Data quality in order: what opportunities does it bring?

When decisions are based on accurate data, there's greater confidence in the choices made. The risk of making incorrect decisions based on erroneous information is reduced. Furthermore, relying on demonstrably correct data strengthens support for, and trust in, data-driven decisions.

With effective monitoring of data quality, you can detect issues earlier. Knowing exactly which process or source produces low-quality data allows you to address the problem at its root cause, improving both data quality and the underlying processes.
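
As an illustration of tracing low-quality data back to its origin, the sketch below computes the share of failing records per source in plain Python. It assumes every record carries a `source` field; the field names, the `has_email` rule, and the 10% threshold are hypothetical choices, not a recommendation.

```python
from collections import defaultdict


def failure_rate_by_source(records, is_valid):
    """Return {source: share of records for which `is_valid` is False}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for record in records:
        source = record["source"]
        totals[source] += 1
        if not is_valid(record):
            failures[source] += 1
    return {source: failures[source] / totals[source] for source in totals}


def sources_over_threshold(rates, threshold):
    """Return the sources whose failure rate exceeds `threshold`, sorted by name."""
    return sorted(source for source, rate in rates.items() if rate > threshold)


def has_email(record):
    """Hypothetical validity rule: the record must have a non-empty email."""
    return bool(record.get("email"))


# Made-up sample records from two hypothetical sources.
records = [
    {"source": "crm", "email": "a@example.com"},
    {"source": "crm", "email": "b@example.com"},
    {"source": "webshop", "email": None},
    {"source": "webshop", "email": "c@example.com"},
    {"source": "webshop", "email": ""},
]

rates = failure_rate_by_source(records, has_email)
print(sources_over_threshold(rates, threshold=0.1))
```

Running the same calculation on every load and storing the rates would give a simple view of quality per source over time.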

Other benefits of implementing data quality management include:

  • Cost savings by preventing time-consuming data issues
  • Prevention of financial and reputational damage
  • More accurate and efficient analyses
  • Scalable and robust systems
  • Increased revenue or profit through better decisions
  • Compliance with data management laws and regulations

How to develop a data quality strategy?

Step 1 - Start with an assessment.

Before determining your strategy, assess where your organisation stands. Begin by answering questions such as:

  • What motivates us to work on data quality?
  • What is our maturity level in terms of data quality?
  • How transparent is our data landscape and its data lineage?
  • Is there an overview of available datasets?
  • Who uses them?
  • Who is responsible for specific datasets?
  • Do guidelines on data quality exist, and are they followed?
  • Are there processes to monitor data quality and address problems at the source?

Several models can assist in this process. Look for one that suits your organisation and whose steps make sense in your context. If you're looking for a framework, the Data Management Body of Knowledge by the Data Management Association (DAMA-DMBOK2) is one of the most widely used worldwide. In the chapter on data quality, it outlines the essential dimensions of data quality and provides a list of actions to define and implement a data quality strategy.

Step 2 - Create an overview of your as-is and to-be.

Determine the current state of data management (as-is) within your organisation, where you want to go (to-be), and why. Create a business case that explains the value of having data quality in order. Provide insight into the risks and opportunities. You can develop new business cases or explore where higher data quality can add value to existing ones.

Step 3 - Set the budget and the extent of improvement.

Data quality doesn't need to go from 0 to 100 immediately; incremental progress is possible. Critically assess the investment and potential returns. View improving data quality as a continuous process. Ideally, considering data quality should always be part of designing new business processes, systems, and solutions. Having a data strategy in place helps identify the necessary steps for data quality.

How to find time to address data quality?

Skilled data professionals are scarce, and data teams are often busy. To optimise the technical capacity of your team, consider the following tips:

Tip 1: Appoint responsible individuals for datasets within your organisation.

Currently, data quality often falls under the responsibility of Data Engineers. However, ideally, a technical person should not be solely responsible for a dataset. The content must align with the business process it supports. Technical colleagues can build checks and monitor processes, but they may not always be best positioned to assess the required data quality level. Business stakeholders who are intimately familiar with the business processes that this data supports are in a better position to evaluate this.

With business owners appointed for each dataset, technical colleagues such as Data Engineers can focus on their expertise: designing and implementing data flows and ensuring that the defined quality checks are correctly implemented in the system.

Tip 2: Use helpful tools and services.

To make the work of scarce technical professionals more efficient, various tools and services are being developed to set up data quality checks relatively easily. These tools can be used to check data quality at specific moments as well as track quality over time. Are there missing values? Are the correct data types used? Where does the data come from (data lineage), and what is available (data catalogs)?

Examples of widely used tools for implementing data quality checks are Soda, Great Expectations, and dbt. The right choice depends on the design of your data applications, the technologies used, and your data team's expertise. When introducing and maintaining data quality checks, it is crucial that the tools align well with your current systems and workflow.
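
To make concrete what such checks do, here is a minimal sketch in plain Python of a missing-value check and a data type check. It deliberately does not use the syntax of any of the tools mentioned; the function names, the `amount` column, and the sample rows are hypothetical.

```python
def check_missing(rows, column):
    """Return the indices of rows where `column` is absent, None, or an empty string."""
    return [i for i, row in enumerate(rows) if row.get(column) in (None, "")]


def check_type(rows, column, expected_type):
    """Return the indices of rows whose non-missing `column` value has the wrong type."""
    return [
        i
        for i, row in enumerate(rows)
        if row.get(column) not in (None, "")
        and not isinstance(row[column], expected_type)
    ]


# Made-up order data with one of each typical problem.
rows = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": None},     # missing value
    {"order_id": 3, "amount": "12,50"},  # text where a number is expected
    {"order_id": 4},                     # column absent altogether
]

print("Missing amount in rows:", check_missing(rows, "amount"))
print("Wrong type for amount in rows:", check_type(rows, "amount", float))
```

Dedicated tools add scheduling, reporting, and history on top of checks like these, which is where most of the time saving comes from.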

Another significant development is the use of generative AI. Several data quality services are starting to adopt this approach. With AI, you can describe a check in natural language, and the tool generates the check for you. This reduces the amount of code you have to write by hand.

However, someone with technical knowledge must verify that the correct tests are implemented and that a test performs its intended function. We foresee this expertise becoming even more valuable as data quality is more widely implemented.
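
One way to perform that verification is to run a check against examples with a known outcome. The sketch below assumes a check is simply a function that returns True for valid values; `generated_check` is a hypothetical stand-in for a generated check, not the output of any specific tool.

```python
def generated_check(value):
    """Stand-in for a generated check: 'a percentage must be between 0 and 100'."""
    return isinstance(value, (int, float)) and 0 <= value <= 100


def verify_check(check, should_pass, should_fail):
    """Return the values for which `check` disagrees with the expected outcome."""
    wrongly_rejected = [v for v in should_pass if not check(v)]
    wrongly_accepted = [v for v in should_fail if check(v)]
    return wrongly_rejected, wrongly_accepted


# Hand-picked examples, including the boundaries and obviously invalid input.
wrongly_rejected, wrongly_accepted = verify_check(
    generated_check,
    should_pass=[0, 50, 99.5, 100],
    should_fail=[-1, 100.1, None, "50"],
)

print("Wrongly rejected:", wrongly_rejected)
print("Wrongly accepted:", wrongly_accepted)
```

Two empty lists mean the check behaves as intended on these examples; anything else points to a case where the generated check and the intended rule diverge.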

Should improving data quality be a higher priority?

Improving data quality within your organisation might not have been at the top of your priority list before you read this article. After all, your data team is already quite busy! However, it's essential to realise that dealing with incorrect data costs time as well.

As your pipelines and data landscape become more complex and extensive, it becomes increasingly challenging and time-consuming to detect and resolve data quality issues. All this time spent on data quality issues takes away from new developments that could have a more significant impact and business value.

By getting your data quality processes in order, you can minimise this effort and develop a more robust, scalable, and easily maintainable system. Improving your data quality requires an investment of time, but it can save you a lot of time in the long run.

We're here to help!

Is data quality high on your agenda? Our data consultants can assist you at every step of the process. This includes designing your data strategy, analysing your data flows, and determining the data quality requirements that suit your needs.

We can also handle the technical implementation for you. We'll conduct checks on your existing data warehouse and establish data quality and governance within your current environment or set up an entirely new data infrastructure for you. Additionally, we're more than happy to help you choose the right tools or services to achieve your desired level of data quality.

This is an article by Kasper Soekarjo, Solution Architect at Digital Power

He combines extensive knowledge of data engineering with strong social skills. This enables him to work closely with clients on technical solutions that align precisely with both the technical and business requirements of a project.
