
How to version data contracts


How, exactly, should you version data contracts?

The default answer is often to use SemVer.

SemVer is a standard from software engineering, widely used to version libraries and releases of software applications.

When we were designing our implementation of data contracts, we originally planned to use SemVer, as it seemed like the obvious choice.

But then we thought: “What is a patch version, and would a consumer care?”

We concluded that patch versions would cover changes to documentation, categorisation, access controls, etc., but we didn’t see why we needed to keep track of those beyond what we already get from the Git history. We also didn’t see why a consumer would care.

So we didn’t use patch versions.

We then applied the same thought exercise to the minor version. Minor versions would cover non-breaking changes, but if a change is non-breaking, why would an existing user need to care? What would they do with that information? They wouldn’t need to make any changes to their code, unless (for example) they wanted the new column, in which case they could start using it whenever they needed to. And again, we have the history in Git if needed.

So we didn’t use minor versions either.

We ended up just using major versions, for breaking changes. That was the only place where it was clear that the version number was communicating something consumers would care about.
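To make the rule concrete, here is a minimal sketch of major-only versioning in Python. It is not our actual implementation: the function names are hypothetical, a schema is modelled as a simple mapping of column name to type, and it assumes a change is breaking only if it removes a column or changes a column's type. Your own definition of "breaking" may well be broader.

```python
def is_breaking(old_schema: dict[str, str], new_schema: dict[str, str]) -> bool:
    """Return True if a consumer relying on old_schema could break on new_schema.

    Assumption for this sketch: removing a column or changing its type is
    breaking; adding a column is not.
    """
    for column, dtype in old_schema.items():
        if column not in new_schema:
            return True  # a column consumers may depend on was removed
        if new_schema[column] != dtype:
            return True  # the column's type changed
    return False


def next_version(current: int, old_schema: dict[str, str], new_schema: dict[str, str]) -> int:
    """Bump the single (major) version number only when the change is breaking."""
    return current + 1 if is_breaking(old_schema, new_schema) else current


# Hypothetical example schemas
v1 = {"order_id": "string", "amount": "decimal"}
with_new_column = {**v1, "currency": "string"}  # non-breaking: column added
without_amount = {"order_id": "string"}         # breaking: column removed

print(next_version(1, v1, with_new_column))  # stays at 1
print(next_version(1, v1, without_amount))   # bumps to 2
```

With a check like this, the version can be derived from the contract itself rather than maintained by hand, and consumers only ever see the number change when they need to act.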

I don’t know for certain that was the best decision, but after 3 years in production it hasn’t caused any issues as yet!

And it has kept things simple. There’s no need to ask the producer to maintain a version file, incrementing the version on every change. And it’s easy for all consumers to understand when and why they should care about a version change - including less technical consumers.




Andrew Jones
Author
I build data platforms that reduce risk and drive revenue.