What do you want from your data?
Do you want it to be fast changing?
You try your best to work around the poor-quality data you’re given, only to deliver a poor outcome to your users.
I enjoyed this post from Nicole Radziwill, PhD on LinkedIn:
How fragile are your pipelines? Start with this simple metric: COUNT THE JOINS. Every time you have to join, you’re making multiple assumptions about the underlying raw data, the biggest one being: you’re assuming it’s not going to change.
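To make the "count the joins" idea concrete, here is a minimal sketch of how you might get a rough number for a single query. It is only a heuristic: it strips comments and single-quoted strings, then counts the `JOIN` keyword, so it will miscount unusual SQL (for example, a `--` inside a string literal). The function name `count_joins` and the example tables are hypothetical, made up for illustration.

```python
import re

# Matches the JOIN keyword as a whole word, case-insensitively.
JOIN_PATTERN = re.compile(r"\bJOIN\b", re.IGNORECASE)


def count_joins(sql: str) -> int:
    """Return a rough count of JOIN clauses in a SQL string (heuristic only)."""
    # Remove /* block */ comments, then -- line comments,
    # so commented-out joins are not counted.
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    sql = re.sub(r"--[^\n]*", " ", sql)
    # Blank out single-quoted string literals so the word "join"
    # inside a value is not counted.
    sql = re.sub(r"'(?:[^']|'')*'", "''", sql)
    return len(JOIN_PATTERN.findall(sql))


# Hypothetical query, used only to show the heuristic in action.
query = """
SELECT o.id, c.name, p.title
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
LEFT JOIN products AS p ON p.id = o.product_id
-- JOIN refunds AS r ON r.order_id = o.id
WHERE c.name <> 'join me'
"""

print(count_joins(query))  # prints 2
```

Run over every query in a pipeline, a count like this gives a crude ranking of where the most assumptions about upstream data are hiding. A real SQL parser would be more reliable than regular expressions if you need exact numbers.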
If you’re a software engineer, and an upstream dependency is unreliable, then you would speak to the team who owns that dependency.
If you want to improve the quality of the data, then you’ll need to speak to the producer of the data.
Staging layers, medallion architectures, data testing, assigning data stewards, gatekeeping application changes until reviewed by a data team.
In a response to my LinkedIn post on how every data transform is technical debt, Tim Hiebenthal commented:
I totally agree with your statements, but I have doubts about the feasibility of implementing it.
Data quality can only be improved at the source.
If the source of the data isn’t capturing the data at the required accuracy, there’s nothing you can do later to increase the accuracy.
As I wrote yesterday, many data professionals don’t trust the data they are building on. And many users of data and data applications don’t trust the data they’re being provided.
At most of my recent talks I’ve asked the audience, which is made up of data professionals, a simple question: Do you trust your data?