
Data contracts on streaming data


Data contracts are typically used to apply change management to data held in tables in a data warehouse.

But the concepts behind data contracts can be applied to any interface where data is made available. For example, we use them for streaming data in Google Pub/Sub, and they could equally be used for streaming data through Kafka or any other streaming platform.

These streams are configured in much the same way as a table.

First, we apply a schema to the stream, taken from the data contract. For Pub/Sub and Kafka that likely means converting the data contract to Avro or Protobuf, which is usually a straightforward mapping of fields and types.
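As a rough illustration of that conversion, here is a minimal sketch in Python. The contract layout (`name`, `fields`, `type`, `required`), the `CONTRACT_TO_AVRO` mapping, and the `example.contracts` namespace are all assumptions made up for this example, not a standard data contract format. The output follows the Avro record schema shape, with optional fields expressed as a union with `null`.

```python
import json

# Hypothetical mapping from contract field types to Avro primitive types.
CONTRACT_TO_AVRO = {
    "string": "string",
    "integer": "long",
    "float": "double",
    "boolean": "boolean",
}


def contract_to_avro(contract: dict) -> dict:
    """Convert a (hypothetical) data contract dict into an Avro record schema."""
    fields = []
    for field in contract["fields"]:
        avro_type = CONTRACT_TO_AVRO[field["type"]]
        if field.get("required", True):
            fields.append({"name": field["name"], "type": avro_type})
        else:
            # Optional fields become a union with null and default to null.
            fields.append(
                {"name": field["name"], "type": ["null", avro_type], "default": None}
            )
    return {
        "type": "record",
        "name": contract["name"],
        "namespace": contract.get("namespace", "example.contracts"),
        "fields": fields,
    }


# Illustrative contract, invented for this example.
order_contract = {
    "name": "OrderPlaced",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount", "type": "float"},
        {"name": "coupon_code", "type": "string", "required": False},
    ],
}

avro_schema = contract_to_avro(order_contract)
print(json.dumps(avro_schema, indent=2))
```

The resulting JSON is what you would then register against the topic, using whatever schema tooling your streaming platform provides.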

Then we apply change management to that contract, so the schema can only be changed if the change is compatible (non-breaking). If it is not, we block the change until a new major version is created.
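A check like that could be sketched as follows. This is a deliberately simplified rule set, assumed for illustration: removing a field, changing a field's type, or adding a required field counts as breaking. Real schema tooling applies richer compatibility rules. The contract layout, the `version` field, and the function names are hypothetical.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List the breaking differences between two versions of a contract."""
    old_fields = {f["name"]: f for f in old["fields"]}
    new_fields = {f["name"]: f for f in new["fields"]}
    problems = []
    for name, field in old_fields.items():
        if name not in new_fields:
            problems.append(f"field removed: {name}")
        elif new_fields[name]["type"] != field["type"]:
            problems.append(f"type changed: {name}")
    for name, field in new_fields.items():
        # A new field is only safe if existing producers may omit it.
        if name not in old_fields and field.get("required", True):
            problems.append(f"required field added: {name}")
    return problems


def change_allowed(old: dict, new: dict) -> bool:
    """Allow compatible changes, or any change that comes with a new major version."""
    old_major = int(old["version"].split(".")[0])
    new_major = int(new["version"].split(".")[0])
    return new_major > old_major or not breaking_changes(old, new)


# Illustrative contract versions, invented for this example.
v1 = {
    "version": "1.0.0",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount", "type": "float"},
    ],
}
# Adds an optional field: compatible.
v1_1 = {
    "version": "1.1.0",
    "fields": v1["fields"] + [{"name": "coupon_code", "type": "string", "required": False}],
}
# Drops a field without a major version bump: blocked.
v1_2 = {"version": "1.2.0", "fields": [{"name": "order_id", "type": "string"}]}
# Same change, but released as a new major version: allowed.
v2 = {"version": "2.0.0", "fields": [{"name": "order_id", "type": "string"}]}

print(change_allowed(v1, v1_1))
print(change_allowed(v1, v1_2), breaking_changes(v1, v1_2))
print(change_allowed(v1, v2))
```

Running a check of this kind in CI, before a contract change is merged, is one way to stop a breaking change from ever reaching the stream.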

And that’s really it for a minimal data contract implementation.

No more difficult than an implementation on a data warehouse.



    Andrew Jones
    Author
    I build data platforms that reduce risk and drive revenue.