
Defining data products with data contracts


Happy Friday everyone!

This is part 3 of my 5-part series on implementing data mesh, and today we’re looking at defining data products with data contracts. See also:

Also links on enabling enterprise data flow and the 13 software engineering laws.


Defining data products with data contracts

Probably the most widely adopted principle from data mesh is treating data as a product.

That said, the concept is far from unique to data, and it has been applied to many other areas over the last few years.

One great example is internal platforms, as described by Manuel Pais (co-author of the excellent Team Topologies book) in this talk and article.

Treating platforms as a product means building platforms that are self-served, accessible, and compelling to internal users. You achieve this by collaborating with the users and focusing on solving real needs.

Treating data as a product is much the same. As data producers we want to provide datasets that are self-served, accessible, and compelling to internal users. And to do that we need to collaborate with those users to understand their requirements.

This collaboration is a two-way street. The data consumers need to provide their requirements and clearly articulate the value of this data. Often it’s up to them to foster this collaboration, particularly if it’s us acting as data consumers and talking to software engineers or other data producers upstream.

Then the data producers will start to understand their responsibilities for this data: its quality, how changes need to be managed, and other expectations.

We can capture these requirements and agreements, and this becomes the start of the data contract.

Diagram showing the data consumers providing requirements, value, ROI - i.e. the incentives, and data producers taking on responsibility for data quality, change management, and SLOs. All of this is captured in the data contract.
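
To make that concrete, here is a minimal sketch of how the items in the diagram might be written down in a structured form. It is only an illustration: the class, its field names, and the example values are all assumptions for this post, not a standard data contract format.

```python
from dataclasses import dataclass, field


@dataclass
class DataContract:
    """Hypothetical shape for an early data contract (not a standard format)."""

    dataset: str
    producer: str
    # From the data consumers: what they need and why it matters.
    requirements: list[str]
    value: str
    # From the data producers: what they take responsibility for.
    data_quality: list[str]
    change_management: str
    slos: dict[str, str] = field(default_factory=dict)

    def to_document(self) -> str:
        """Render the contract as plain structured documentation."""
        lines = [f"Data contract: {self.dataset} (owned by {self.producer})"]
        lines.append(f"Value: {self.value}")
        lines += [f"Requirement: {r}" for r in self.requirements]
        lines += [f"Quality: {q}" for q in self.data_quality]
        lines.append(f"Change management: {self.change_management}")
        lines += [f"SLO {name}: {target}" for name, target in self.slos.items()]
        return "\n".join(lines)


# Example values are invented for illustration.
contract = DataContract(
    dataset="orders",
    producer="checkout-team",
    requirements=["one row per order", "available by 06:00 UTC"],
    value="feeds the daily revenue report",
    data_quality=["order_id is unique and never null"],
    change_management="breaking changes announced 30 days ahead",
    slos={"freshness": "under 24 hours"},
)
print(contract.to_document())
```

The point of keeping both sides in one structure is that the incentives and the responsibilities sit next to each other, so neither gets agreed without the other.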

At this stage the data contract is not much more than structured documentation, and as many of us will know, documentation tends to lose its value quickly as the data evolves and changes.

That’s why it’s important to make the data contract useful beyond just documentation. The more useful it is, the more likely it will stay updated.
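
As one small, hypothetical example of "useful beyond documentation": if the contract lists the expected fields and their types, the same definition can be used to check records automatically. The schema, field names, and `validate_record` helper below are assumptions made up for this sketch, not part of any particular tool.

```python
# Hypothetical contract schema: expected field names mapped to Python types.
CONTRACT_SCHEMA = {
    "order_id": str,
    "amount": float,
    "created_at": str,
}


def validate_record(record: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    # Check every field the contract promises.
    for name, expected_type in schema.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(record[name]).__name__}"
            )
    # Flag anything the contract does not mention.
    for name in record:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems


good = {"order_id": "A1", "amount": 9.99, "created_at": "2024-01-01"}
bad = {"order_id": 1, "amount": 9.99}
print(validate_record(good, CONTRACT_SCHEMA))
print(validate_record(bad, CONTRACT_SCHEMA))
```

Once a check like this runs against real data, a contract that drifts out of date starts causing visible failures, which gives everyone a reason to keep it current.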

That’s what we’ll be talking about next week, as we use this data contract to power a self-service data platform.


Removing Constraints: Enabling Enterprise Data Flow by James Grafton (LinkedIn)

Nice article on focussing on the flow of data and removing bottlenecks. Although, as I commented (and James agreed), if that’s the only goal it could lead to some interesting decisions.

For example, the easiest way to get data out of a system is to extract it from its database, using CDC or similar. But that creates tight coupling to the database and its schema and the resulting data is unreliable.

If we look at other areas like software engineering and platform engineering, while they also want to remove bottlenecks, they still maintain strong interfaces to ensure systems can work together reliably and be maintainable over the long term.

The 13 software engineering laws by Anton Zaides

A nice reminder of common software engineering “laws”.


Being punny 😅

I’ve just watched a program about beavers. It was the best dam documentary I’ve ever seen.


Upcoming workshops


Thanks! If you’d like to support my work…

Thanks for reading this week’s newsletter. Always appreciated!

If you’d like to support my work, consider buying my book, Driving Data Quality with Data Contracts, or if you have it already, please leave a review on Amazon.

Enjoy your weekend.

Andrew




Author
Andrew Jones
I build data platforms that reduce risk and drive revenue.