
Data contracts anti-pattern #2: Checkbox compliance

Keep your data contracts lean and useful

Hello!

This is the second in my series of data contracts anti-patterns, following on from last week’s post on contracts as documentation.

Again, this one looks like progress, but in reality has no impact.

There are also links to articles on data platform incident management and encoding your data expert.

Finally, I had a fun discussion on What every Beginner Should Know About Data Governance with 10Alytics earlier this week, and you can watch the recording on LinkedIn.


Data contracts anti-pattern #2: Checkbox compliance

You’ve set up data contracts and have your validation checks in place, and now you ask your data producers to complete them.

However, you find the data producers are filling in the required contract fields quickly, with whatever passes validation. Most of it is copy/paste (is_pii: false on every field, no descriptions, no quality rules), and no useful context has actually been provided.

This happens because data producers are being asked to fill in fields for purposes they can’t see. The ask is “fill in this YAML” rather than “here is what we’ll do with each field, here’s why it matters, here’s a library that validates locally so you catch mistakes before CI does.”
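As a rough illustration of that second kind of ask, here is a minimal sketch of a local check a producer could run before CI. Everything in it is an assumption for illustration: the contract layout (a `fields` list with `name`, `description`, `is_pii`, and `quality_rules`), the `FIELD_PURPOSE` text, and the `lint_contract` helper are hypothetical, not part of any particular data contract library. The idea is that every warning also says what the platform does with the field.

```python
from typing import Any

# Hypothetical: what the platform does with each field, shown to the producer.
FIELD_PURPOSE = {
    "description": "shown in the data catalog so consumers can understand the field",
    "is_pii": "drives masking and access controls",
    "quality_rules": "run as checks before the data is published",
}


def lint_contract(contract: dict[str, Any]) -> list[str]:
    """Return warnings for content that passes validation but carries no context."""
    warnings: list[str] = []
    fields = contract.get("fields") or []

    for field in fields:
        name = field.get("name", "<unnamed>")
        # An empty or missing description is valid YAML but useless to consumers.
        if not (field.get("description") or "").strip():
            warnings.append(
                f"{name}: description is empty ({FIELD_PURPOSE['description']})"
            )
        # No quality rules means the platform has nothing to check.
        if not field.get("quality_rules"):
            warnings.append(
                f"{name}: no quality rules ({FIELD_PURPOSE['quality_rules']})"
            )

    # is_pii: false on every field is a common copy/paste pattern worth confirming.
    if len(fields) > 1 and all(f.get("is_pii") is False for f in fields):
        warnings.append(
            f"is_pii is false on every field, please confirm ({FIELD_PURPOSE['is_pii']})"
        )
    return warnings


if __name__ == "__main__":
    placeholder = {
        "fields": [
            {"name": "email", "is_pii": False},
            {"name": "country", "is_pii": False},
        ]
    }
    for warning in lint_contract(placeholder):
        print(warning)
```

These are heuristics rather than hard failures, so they are better suited to prompting a conversation with the producer than to blocking a merge.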

A simple cartoon illustration showing a data team member shouting "Populate your data contract! All fields must be present" at a data producer who is thinking "Why? What does it do? I'll do the minimal to get them off my back...". There is a document labeled "Data Contract" in the center with annotations in red: "No impact" pointing to the left and "Copy/paste, no rules, no context" pointing to the right.

These placeholder contracts are actually harder to fix than missing contracts. A missing contract is visibly absent, but with a placeholder it’s not obvious which context is correct and which is not, and any automation the platform runs based on the contract can only assume all of it is correct.

Before asking producers to fill in any fields, show them what the platform will do with their context. This gives them a reason to provide it accurately and keep it up to date as the data evolves.

Once the platform can act on a field, there’s a visible consequence when it’s wrong.
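To make that concrete, here is a small sketch of a platform acting on one field. It assumes the same hypothetical contract layout (a `fields` list with `name` and `is_pii`), and `consumer_view` and `MASK` are made-up names for illustration. The platform masks anything marked as PII before consumers see it, so a copy/pasted `is_pii: false` shows up immediately as an unmasked value.

```python
from typing import Any

MASK = "***"  # Hypothetical placeholder shown to consumers instead of PII values.


def consumer_view(record: dict[str, Any], contract: dict[str, Any]) -> dict[str, Any]:
    """Return the record as consumers would see it, masking fields marked is_pii."""
    pii_fields = {
        field["name"]
        for field in contract.get("fields", [])
        if field.get("is_pii") is True
    }
    return {
        key: (MASK if key in pii_fields else value)
        for key, value in record.items()
    }


if __name__ == "__main__":
    record = {"email": "jane@example.com", "country": "GB"}

    accurate = {"fields": [{"name": "email", "is_pii": True},
                           {"name": "country", "is_pii": False}]}
    placeholder = {"fields": [{"name": "email", "is_pii": False},
                              {"name": "country", "is_pii": False}]}

    print(consumer_view(record, accurate))     # email is masked
    print(consumer_view(record, placeholder))  # email is exposed: the visible consequence
```

Showing producers this kind of before/after for their own data is one way to explain why the field matters before asking them to fill it in.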


How I Made My Data Platform’s Failures Public and Earned My Stakeholders’ Trust by Yordan Ivanov

Treat your data platform as a product by tracking incidents, having a status page, and so on.

Encoding Your Domain Expert: The Context Layer Behind Spotify’s Data Assistant by Pavlina Mitsou and Jonathan Warburton

Interesting findings on how important it is to ask a domain expert to help curate the context.


Being punny 😅

The most disapproving of all the Pharaohs was King Tut.


Thanks! If you’d like to support my work…

Thanks for reading this week’s newsletter, always appreciated!

If you’d like to support my work, consider buying my book, Driving Data Quality with Data Contracts, or if you have it already, please leave a review on Amazon.

Enjoy your weekend.

Andrew


Want great, practical advice on implementing data mesh, data products and data contracts?

In my weekly newsletter I share with you an original post and links to what's new and cool in the world of data mesh, data products, and data contracts.

I also include a little pun, because why not? 😅



    Andrew Jones
    Author
    I build data platforms that reduce risk and drive revenue.