
Flexible data contracts


When it comes to data contracts, I’m always talking about well-structured data.

That’s deliberate, as we use data contracts to make data available to others (teams/groups/domains), and the aim is to provide data that is easy to consume with confidence.

If the data is less structured (more flexible) it is going to be difficult to consume, and you’ll have less confidence in it.

As a consumer, you now have to know how to parse it. So the producer needs to document the parsing rules clearly for the consumer to make use of it.

Those parsing rules can then change at any time, likely without your knowledge, breaking your application.

For example, I’ve seen some data contracts that have a field like event_type. Depending on the value of that field, some fields will be populated, and some will not.

E.g. if event_type is a, then fields x, y and z will be populated; if it is b, only x will be; and so on.
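To make that concrete, here is a minimal sketch of the kind of parsing logic a consumer ends up writing for a contract like this. The mapping and the function name are hypothetical and only mirror the a/b example above; a real contract would have its own rules, documented somewhere outside the schema.

```python
# Hypothetical rules the consumer has to hard-code from the producer's docs.
# None of this is expressed in the schema itself.
POPULATED_FIELDS_BY_EVENT_TYPE = {
    "a": {"x", "y", "z"},
    "b": {"x"},
}


def parse_event(record: dict) -> dict:
    """Return the fields the consumer expects to be populated for this event_type."""
    event_type = record["event_type"]
    if event_type not in POPULATED_FIELDS_BY_EVENT_TYPE:
        raise ValueError(f"Unknown event_type: {event_type!r}")

    expected = POPULATED_FIELDS_BY_EVENT_TYPE[event_type]
    missing = sorted(name for name in expected if record.get(name) is None)
    if missing:
        raise ValueError(f"event_type {event_type!r} is missing fields: {missing}")

    return {name: record[name] for name in sorted(expected)}
```

Every consumer has to keep its own copy of that mapping in step with the producer, which is exactly the coupling a data contract is supposed to remove.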

That might work OK for a while, but at some point those parsing rules will need to change.

Those rules are not captured in a schema, so the change management you’ve implemented with data contracts cannot catch that breaking change.
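As a rough illustration of why the change slips through, the sketch below compares two versions of a flat schema with a deliberately naive breaking-change check. The schema layout, the type strings and the breaking_changes function are all assumptions made up for this example, not the API of any particular data contract tool.

```python
# Hypothetical flat schema: every field is nullable so any event_type fits.
SCHEMA_V1 = {"event_type": "string", "x": "int?", "y": "int?", "z": "int?"}
SCHEMA_V2 = dict(SCHEMA_V1)  # the producer changes behaviour, not the schema


def breaking_changes(old: dict, new: dict) -> list:
    """Naive schema diff: report removed fields and changed types."""
    problems = []
    for name, type_ in old.items():
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name] != type_:
            problems.append(f"type changed: {name} {type_} -> {new[name]}")
    return problems


# The unwritten parsing rules before and after the producer's change.
rules_v1 = {"a": {"x", "y", "z"}, "b": {"x"}}
rules_v2 = {"a": {"x", "y"}, "b": {"x"}}  # z is no longer populated for a

print(breaking_changes(SCHEMA_V1, SCHEMA_V2))  # []
print(rules_v1 == rules_v2)  # False
```

The schema check reports nothing, even though any consumer relying on z for event_type a is now broken.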

That leads to the same kind of incidents you had before data contracts.

Here’s the thing:

Data contracts deliberately force producers and consumers of data to apply more discipline to the production and management of data. That is harder, but by reducing incidents and increasing the quality of data we get a better outcome at lower cost.

Trying to make data contracts flexible removes those benefits.



    Andrew Jones
    Author
    I build data platforms that reduce risk and drive revenue.