
Being reactive with data quality


Most of the time we’re reacting to data quality issues.

Maybe someone has changed their database schema, and because we pull that data into our data warehouse directly from their database, the change breaks everything we’ve built. Or maybe the business logic has changed upstream, and our own version of that logic, built in the data warehouse, has fallen out of sync.

But each of these issues has a cost. There’s the cost of our time to resolve the issue, when we could be doing something more valuable. There’s the cost to our users, who don’t have the data they need. It also costs us some of the trust we have with our users, as each time there is an issue they question whether they can depend on that data. That lost trust is very difficult (and very costly!) to win back.

Some might say that cost is ok if it’s “just dashboards” (although, if we’re paying so much for these dashboards, they must be important!). But it’s certainly not ok if we’re building customer-facing applications on the same data.

And that’s the reality for many data teams now. They find people are building increasingly important products on their data, without realising just how fragile the data is.

It’s time to start being a little more proactive.



Author
Andrew Jones
I build data platforms that reduce risk and drive revenue.