2024
If you want to improve the quality of the data
Staging layers, medallion architectures, data testing, assigning data stewards, gatekeeping application changes until reviewed by a data team.
In response to my LinkedIn post on how every data transform is technical debt, Tim Hiebenthal commented:
Data quality can only be improved at the source.
As I wrote yesterday, many data professionals don’t trust the data they are building on.
At most of my recent talks I’ve asked the audience, which is made up of data professionals, a simple question: Do you trust your data?
2023
Yesterday I wrote about prioritising data quality projects, but I think it’s important to note that although many of us do need to improve quality, we don’t need to aim for perfection.
It’s likely going to be difficult to get a project prioritised if the goal is defined only as “improving data quality”.
If you’re building key product features on top of your data, shouldn’t it be as dependable as your software-backed features?
No matter what we do, when working at sufficient scale and/or sufficient speed it’s inevitable that things will go wrong.