2024
Data contracts set the expectations for the data.
These include:
- How to access the data
- How the data will be structured
- How often the data will be published
- Who to contact about the data (the owner)
- What will happen when the data needs to evolve
Without expectations, users make assumptions that are more optimistic than reality.
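To make those expectations a little more concrete, here is a minimal sketch of how they could be written down as a structure in Python. This is only an illustration, not a standard format: the `DataContract` class, its field names, the `missing_expectations` helper, and the example values (the access string, the owner address) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    """Hypothetical container for the expectations a data contract sets."""

    name: str
    access: str               # how to access the data
    schema: dict[str, str]    # how the data is structured (field -> type name)
    publish_frequency: str    # how often the data is published
    owner: str                # who to contact about the data
    evolution_policy: str     # what happens when the data needs to evolve

    def missing_expectations(self) -> list[str]:
        """Return the names of any expectations that were left empty."""
        return [key for key, value in vars(self).items() if not value]


# Example values are made up for illustration.
orders = DataContract(
    name="orders",
    access="warehouse table analytics.orders",
    schema={"order_id": "string", "amount": "decimal", "created_at": "timestamp"},
    publish_frequency="hourly",
    owner="checkout-team@example.com",
    evolution_policy="",  # deliberately left blank
)

print(orders.missing_expectations())  # ['evolution_policy']
```

Anything left blank is exactly where a user would have to fall back on an assumption.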
As I wrote yesterday, many data professionals don’t trust the data they are building on. And many users of data and data applications don’t trust the data they’re being provided.
2023
Mostly when I talk about data contracts I’m talking about applying them to data generated within our organisations. That’s usually the most valuable data we have. It’s also the data whose generation we have the most control over.
APIs and data contracts have a lot in common, and APIs were part of the inspiration behind data contracts when I was coming up with the idea a few years back. They both provide the interface (see my post from a couple of weeks ago on the importance of interfaces), they both set expectations for the user (the structure, semantics, SLOs, and so on), and they both allow for integrations with other services, tools, etc.
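As a rough illustration of the "expectations on structure" part, an API would typically reject a request that doesn't match its interface, and a data contract can be used to check records in the same way. The sketch below is an assumption of how that might look, not how any particular tool does it; `EXPECTED_STRUCTURE` and `violations` are hypothetical names.

```python
# Hypothetical expected structure: field name -> Python type.
EXPECTED_STRUCTURE = {"user_id": str, "email": str, "signup_ts": float}


def violations(record: dict, structure: dict) -> list[str]:
    """List the ways a record departs from the expected structure."""
    problems = []
    for name, expected_type in structure.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected_type.__name__}, got {actual}")
    return problems


good = {"user_id": "u1", "email": "a@example.com", "signup_ts": 1700000000.0}
bad = {"user_id": 1, "email": "a@example.com"}

print(violations(good, EXPECTED_STRUCTURE))  # []
print(violations(bad, EXPECTED_STRUCTURE))
# ['user_id: expected str, got int', 'missing field: signup_ts']
```

This only covers structure. Semantics and SLOs would need their own checks.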
In yesterday’s note I wrote about the problem with defaults. One response to my personal data example could be “why don’t we just infer it?”.
How do you like your data?
Do you want it to be agile? So it can change at any moment, depending on the needs or wants of those producing the data? If a team decides it wants to model an object differently, with different IDs, they can do so. They are moving fast and breaking things.
Data governance is changing.
With the move towards data mesh we’re seeing a move away from centralised governance teams defining policies and assigning roles like “data steward” (which no one wanted to be!) to people in the organisation. Instead, those responsibilities are being delegated to the teams and domains who produce the data and have the most context on it.
I’m writing this on the train back from Paris, having spent the last few days at the apidays conference, where I gave a talk titled “Data contracts: The API for Data”. I have a lot to digest and plenty of notes to go through! But one thing that struck me is how many similarities there are between software and data.