
Data Contracts



2024


You broke our data, so your PRs now need our signoff

·1 min

“You broke our data, so your PRs now need our signoff.”

This is a common reaction from data teams whose pipelines keep breaking because of upstream data changes.

Data products and data contracts

·1 min

Data contracts underpin data products.

With data contracts we are explicitly saying that data should be treated as a product by the teams who produce it. That data is then provided through a stable interface.
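
To make "stable interface" a little more concrete, here is a minimal sketch of a contract expressed as a set of expected fields and types, with a check a producer could run before publishing. Everything in it is hypothetical and for illustration only: the `ORDERS_CONTRACT` name, its fields, and the `violations` helper are my assumptions, not part of any particular data contract tool or standard.

```python
# Hypothetical contract: field name -> expected Python type.
ORDERS_CONTRACT = {"order_id": str, "amount": float, "placed_at": str}


def violations(record: dict, contract: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the record conforms."""
    problems = []
    for name, expected in contract.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    return problems


if __name__ == "__main__":
    good = {"order_id": "o-1", "amount": 9.5, "placed_at": "2024-01-01"}
    bad = {"order_id": 1, "amount": 9.5}
    print(violations(good, ORDERS_CONTRACT))
    print(violations(bad, ORDERS_CONTRACT))
```

A real contract would normally live in a shared, versioned definition rather than in application code, but the idea is the same: the producer checks the data against the agreed interface before consumers ever see it.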

Active vs passive data publishing

·2 mins

Passive data publishing is the norm in most organisations.

That’s where we’re using patterns like ELT or CDC to extract copies of the upstream database and replicate them in a data warehouse/lake. The data producer isn’t doing anything to facilitate this - they are passive.

How to version data contracts

·2 mins

How, exactly, should you version data contracts?

The default answer is often to use SemVer.

SemVer is a standard from software engineering that is widely used to version libraries and releases of software applications.
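
As a rough illustration of what applying SemVer to a data contract could look like, the sketch below picks the next MAJOR.MINOR.PATCH version by comparing two versions of a schema. The mapping is an assumption made for this example (removed or retyped fields are breaking, added fields are backwards compatible, anything else is a patch), and the `Version` class and `next_version` function are hypothetical names, not an established API or a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """A MAJOR.MINOR.PATCH version; ordering compares the fields left to right."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def next_version(current: Version, old_fields: dict[str, str],
                 new_fields: dict[str, str]) -> Version:
    """Choose the next version from a field-name -> type-name schema diff.

    Assumed mapping (illustrative only): removing or retyping a field is a
    major bump, adding a field is a minor bump, anything else is a patch.
    """
    shared = old_fields.keys() & new_fields.keys()
    removed = old_fields.keys() - new_fields.keys()
    added = new_fields.keys() - old_fields.keys()
    retyped = {name for name in shared if old_fields[name] != new_fields[name]}

    if removed or retyped:
        return Version(current.major + 1, 0, 0)
    if added:
        return Version(current.major, current.minor + 1, 0)
    return Version(current.major, current.minor, current.patch + 1)


if __name__ == "__main__":
    current = Version.parse("1.4.2")
    old = {"order_id": "string", "amount": "decimal"}
    print(next_version(current, old, {**old, "currency": "string"}))
    print(next_version(current, old, {"order_id": "string"}))
```

Note that comparing versions as tuples of integers, rather than as strings, is what makes 1.10.0 sort after 1.9.9.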

