The importance of data quality

The far-reaching consequences when documentation is not in order

Chantal van der Jagt
Lead Data Analyst
9 min
10 Jun 2020

Are you going to make decisions based on data? Then you have to ensure that your data quality is in order. Good documentation according to a clear process is essential here. Why and how? You can read it in this article.

Improving healthcare through data 

UMC Utrecht has been experimenting with Data Science in psychiatry for some time now. A video shows them using Data Science to find a way to make the unpredictable psychiatric patient more predictable.

The hospital uses, among other things, personal and sensitive data from wearables, social data and input from the patient and close relatives, including word usage analyses. For example, the algorithm can be enriched with contextual data that can provide much information about possible triggers in psychiatric patients.

The advantages mentioned are numerous: a better and faster diagnosis, a better match between patient and medication, and the ability to provide preventive care. There is also less administrative work for doctors and nurses, so that potentially fewer staff are needed, which helps with the staff shortages in healthcare. These are huge improvements, provided that the use of this type of data complies with privacy legislation (GDPR).

But what if the data contains errors? 

UMC Utrecht's experiment is a striking example. And this is not the only healthcare institution that aims to preserve the dignity of an individual for as long as possible with the best possible care.

But what if small errors slip into the data? Historical data may be subjective. Data input could be incomplete and be missing relevant data from other healthcare institutions and government agencies (transfer). Data definitions may vary per source. Data may be misinterpreted.

Data errors, however small, can potentially have disastrous consequences for the well-being of patients. If misdiagnosed, the disease could worsen or other disorders could develop. As a result, the quality of life for these patients declines so much that the objective, namely good care, has the opposite effect.

In that case, the algorithm does not increase human dignity; it reduces it. The mere possibility of this is horrifying enough. This is exactly the scenario that the EU is trying so hard to prevent through the GDPR. Hence, it argues that transparency is necessary for organisations engaged in collecting and applying data.

Focus on data quality 

If you as an organisation know that poor, or even slightly substandard, data quality can have far-reaching negative consequences for your end customer (in this case, the individual patient), you will want to do everything you can to prevent this and put protocols in place that guarantee quality.

Documenting data: a crucial task that is often not (fully) performed 

To guarantee your data quality, you need transparent processes and clear protocols within your organisation. In practice, the reproducible documenting of data is often the last and least rewarding step in a process.

Because work is done at a fast pace, this is in fact the step that is often performed incompletely or skipped entirely. But what does that actually mean?

To illustrate, see three different types of results below:

[Figure: data quality]

If you build well, you have a house. Based on reliable patient data, a doctor can make the correct diagnosis. And with the help of customer data, you can make a personalised offer.

But what if a mistake slips into the process steps? The result will then look a lot less rosy. Is the end result still reliable enough? Below you can see what can change if the data is not completely correct:

[Figure: data quality]

From the first two examples, we can all vividly imagine the steps to be taken and the result. In the UMC example, the impact would be large and negative if the process steps were not followed carefully and thoughtfully. It would result in even sicker patients due to incorrect input or processing.

In the building example, a miscalculation of a span between two load-bearing walls could have a dramatic outcome. If the house is not built exactly according to design and construction, it is not a house but a series of windows, sockets and walls without any plan. Is it still a house?

The third result, the personalised offer, is just as painful, but less visible. The error often falls within an accepted margin, and the results seem to correspond reasonably well with what was envisioned. But if the design and instruction contain an error, it will affect the result.

Ask yourself the following questions:

  • "Is it less of a problem if a part of the budget is spent on initiatives and developments that you do nothing with in the end?" 
  • "Is it less of a problem if the why, how and when of data creation is not reproducible?" 
  • "Is it less of a problem if experiments are so unreliable in terms of data input that a random guess would probably have the same effect?" 

No self-respecting organisation will want this.

There is too much focus on visible results 

Yet with Lean, Agile and Scrum working methods, this tends to sneak in quickly, because appreciation is mainly expressed for the delivery of visible results. That is where the gratitude from stakeholders lies. They want to see results, and we always do it for them. If they are happy, then what's the problem?!

Besides the fact that this is not smart and potentially costs you a lot of money, it is unethical. You are delivering unsound results. This affects not just you and your colleagues (think of the extra work it takes to cross-check everything and come up with workarounds), but also your organisation: after all, it invests in this fruitless use of time.

The end user, your customer, is also disadvantaged. This is because they are subsequently saddled with a result in which they are not the focal point. For example, they are placed in a segment where they don't belong, and the service they receive does not (optimally) match their customer behaviour.

Documenting: how it should be done 

In the long run, following several clear documenting steps can save a lot of frustration and fruitless use of time. It takes effort now to reap the benefits later.

Good documenting and clear data processes create transparent organisation-wide definitions of metrics, the rules used and the context in which they should be used. They ensure that all departments speak the same language and understand each other. This makes Key Performance Indicators (KPIs), targets, and operational action points logical and well-substantiated for all the organisational layers that work with the data. It provides a foundation that you can build on together.
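As an illustration, here is a minimal Python sketch of what one such organisation-wide metric definition could look like when it is recorded in a structured way. Everything in it is a hypothetical example rather than a prescribed format: the class MetricDefinition, its field names, the helper missing_fields and the conversion_rate metric are assumptions made for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """One documented metric; all field names are illustrative assumptions."""
    name: str
    definition: str     # what the metric means, in plain language
    business_rule: str  # how it is calculated or filtered
    context: str        # when it should (and should not) be used
    owner: str          # who to ask when the definition is unclear


def missing_fields(metric: MetricDefinition) -> list[str]:
    """Return the names of documentation fields that are still empty."""
    return [
        key for key, value in vars(metric).items()
        if not str(value).strip()
    ]


# Hypothetical metric whose context has not been documented yet.
conversion_rate = MetricDefinition(
    name="conversion_rate",
    definition="Share of sessions that end in an order",
    business_rule="orders / sessions, internal traffic excluded",
    context="",
    owner="Web analytics team",
)

print(missing_fields(conversion_rate))  # ['context']
```

A check like this makes incomplete documentation visible early, before different departments start using the same metric with different meanings.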

Documenting: a step-by-step plan 

Are you convinced of the need for good documenting? We will be happy to help you on your way with a step-by-step plan that will help you include documenting as a process component in your workflow.

[Figure: step-by-step plan]

Business requirements 

This describes a desired result and stems from what the business stakeholders need to know. The required information for business stakeholders should align with the organisational and department strategy and objectives.

It is a phase that requires a great deal of coordination with all stakeholders in order to ultimately arrive at jointly supported business requirements. Spend most of your time on this.

Data layer documenting 

This forms the design and the instruction of variables that must be read from the website data layer and subsequently measured in dimensions and/or metrics of the measurement solution.
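To make this concrete, the sketch below shows one possible way to compare a data layer event with its documented design. It assumes the data layer event is available as a Python dictionary. The specification DATA_LAYER_SPEC, the variable names in it and the function check_data_layer are hypothetical and only serve as an example.

```python
from __future__ import annotations

# Hypothetical data layer specification: variable name -> expected type.
DATA_LAYER_SPEC = {
    "page_type": str,
    "product_id": str,
    "order_value": float,
}


def check_data_layer(event: dict) -> list[str]:
    """Compare one data layer event with the documented specification."""
    problems = []
    for name, expected_type in DATA_LAYER_SPEC.items():
        if name not in event:
            problems.append(f"missing variable: {name}")
        elif not isinstance(event[name], expected_type):
            problems.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(event[name]).__name__}"
            )
    # Variables that are sent but never documented are flagged too.
    for name in event:
        if name not in DATA_LAYER_SPEC:
            problems.append(f"undocumented variable: {name}")
    return problems


# Hypothetical event with a missing, a mistyped and an undocumented variable.
event = {"page_type": "checkout", "order_value": "49.95", "campaign": "spring"}
for problem in check_data_layer(event):
    print(problem)
```

For this example event, the check reports that product_id is missing, that order_value is a string instead of a number, and that campaign is not documented.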

Measurement solution documenting 

This forms the design and the instruction of the measurement solution. It describes which dimensions, variables, metrics, business rules, and definitions should be used to set up the necessary measurement of the business requirements.

Description of the analytics model 

This describes the definitive measurement solution. What dimensions, variables, metrics, business rules and definitions were applied to create the measurement? It is important that clear terms are used here. It forms the reference work to be consulted by the organisation.

Analysis examples 

This unambiguously indicates which type of reports and analyses can be done using this analytics model.

Are you behind in terms of documenting?

You may be dealing with overdue documenting maintenance. In that case, as with bug fixes, you can allocate a dedicated number of hours per sprint to retrospective documenting:

Start by making the rounds of all stakeholders with one check question: are the business requirements and the current measurements still aligned?

Using these outcomes, proceed as follows:

  • Start with the good news and document all measurements that still meet the business requirements but are not yet/incompletely documented;
  • Follow up with measurements where there is a misalignment between business requirements and the current measurement solution and add this as a Request For Change (RFC) to the backlog;
  • End with measurements that no longer appear in the business requirements. These will have to be phased out.
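The three steps above can be sketched as a simple triage function. This is only an illustration under stated assumptions: the class Measurement, its three yes/no fields, the function triage and the example measurement names are all hypothetical, and in practice the answers come from the rounds with your stakeholders.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Measurement:
    """One existing measurement; the fields are illustrative assumptions."""
    name: str
    in_requirements: bool     # still appears in the business requirements
    meets_requirements: bool  # current measurement matches the requirement
    documented: bool          # documentation exists and is complete


def triage(m: Measurement) -> str:
    """Decide the follow-up action for one measurement."""
    if not m.in_requirements:
        return "phase out"
    if not m.meets_requirements:
        return "add RFC to backlog"
    if not m.documented:
        return "document"
    return "no action"


# Hypothetical outcomes of the rounds with stakeholders.
measurements = [
    Measurement("newsletter_signups", True, True, False),
    Measurement("checkout_funnel", True, False, True),
    Measurement("legacy_banner_clicks", False, False, False),
]

for m in measurements:
    print(f"{m.name}: {triage(m)}")
```

The function only decides the action per measurement. The order in which you pick them up still follows the list above: first document, then RFCs, then phasing out.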

Get started! 

Are you going to document (better) after reading this article? Always prioritise according to business impact. This way, you work step by step on good documenting, and you continuously improve the data quality. Would you like some assistance with this? Please contact us.

This is an article by Chantal van der Jagt, Data Analytics Lead Consultant at Digital Power

Chantal has helped quite a number of organisations at Digital Power. She has experience from both a strategic and operational point of view in how strategy, data and working methods must be aligned in order to achieve an integrated data-driven strategy.

Chantal van der Jagt

Lead Data Analyst
chantal.vanderjagt@digital-power.com
