Talking to software engineers

Yesterday I talked to software engineers about data quality.

It went well!

Specifically, it was about the responsibilities they have, as data producers, when they migrate a data contract to a new version.

Although our tooling for data contracts is good, and people are creating new versions for breaking changes, unless they follow a good change management process they can still cause problems for downstream consumers.

For example, if they delete the old version of the data contract before consumers have migrated off it, the consumers’ applications will still break.

Or if they fail to backfill the data, then the value of that dataset for analytics and data science is much reduced.
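
To make those two failure modes concrete, here is a minimal sketch of a check a producer could run before deleting an old contract version. It is not our actual tooling: `ContractVersion`, its fields, and `blockers_to_retiring` are hypothetical names I have made up for illustration, and it assumes you can already list which consumers still read a version and whether the new version has been backfilled.

```python
from dataclasses import dataclass, field


@dataclass
class ContractVersion:
    """Hypothetical record of one version of a data contract."""
    name: str
    version: int
    # Teams still reading this version (assumed to come from your own tooling).
    consumers: set[str] = field(default_factory=set)
    # Whether historical data has been loaded into this version.
    backfilled: bool = False


def blockers_to_retiring(old: ContractVersion, new: ContractVersion) -> list[str]:
    """Return the reasons the old version cannot be deleted yet (empty means safe)."""
    blockers = []
    if old.consumers:
        names = ", ".join(sorted(old.consumers))
        blockers.append(f"consumers still on v{old.version}: {names}")
    if not new.backfilled:
        blockers.append(f"v{new.version} has not been backfilled")
    return blockers


old = ContractVersion("orders", 1, consumers={"bi", "data-science"})
new = ContractVersion("orders", 2)
for reason in blockers_to_retiring(old, new):
    print(reason)
```

The point of returning a list of reasons rather than a yes or no is that the producer gets told who to talk to and what is left to do, which is most of what good change management is.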

Some of our software engineering teams do this well. Some don’t.

So, I raised it with our group of principal software engineers, the most senior software engineers at our organisation.

This is how I spoke about the problem.

I reminded them what data contracts are, by saying:

Data contracts are an interface for data. Similar to how APIs are used as an interface between services, data contracts are the interface for moving and managing large amounts of data so other teams can consume that data, including engineering, BI and Data Science.

So already I’m speaking their language. I was off to a good start.

I then spoke about the problem.

Like every interface, there is some change management required when making changes to it. Recently we’ve noticed some examples where teams didn’t follow the change management process, and these were the issues it caused.

After explaining the impact of the issues, in particular how that impacted the business, I spoke about what I did to try to solve the problem.

I’ve written some documentation on how the changes should be managed, including how to identify data consumers using the tools we have.

I’ve also asked our Data Engineering team to raise incidents when unexpected breaking changes cause them issues, and to assign each incident to the team that owns the data.

The second point is important. We have a good incident culture, and when things break we expect an incident to be raised. Once the incident is resolved, we expect teams to learn from it and take actions to prevent the same incident occurring again.

As an extra benefit for us, this raises the visibility of the problem. If we keep having the same type of incidents, people are going to notice, and they’re going to ask why.

I finished by asking the software engineers to keep an eye out for breaking changes and ask their teams to consider the impact on data consumers before making the change.

The software engineers were in complete agreement with what I’d asked.

They understood why it was important to have good change management on our data.

We then had another 15 minutes of great discussion on what else we could do, mainly around the topic of visibility and awareness of this problem.

A lot of people in data say you can’t talk to software engineers. They’ll never care about data quality.

But if I can, you can.

There’s nothing special about me that allowed me to do this. I just prepared my message well and went for it.

What’s stopping you from talking to your software engineers?



    Andrew Jones
    Author
    I build data platforms that reduce risk and drive revenue.