Bridging the gap between data producers and consumers
Hey, hope your week has been good :)
In today’s newsletter I talk about bridging the gap between data producers and consumers.
There are also links to posts on why real-time analytics is so hard, why product teams should own data, and core principles for data integration.
Bridging the gap between data producers and consumers
If you’ve been in data for any length of time, you will be aware that there is a gap between those who produce the data, and you as the consumers.
This is largely down to the organisational distance between you and the data producers.
For example, much of a business’s most important data comes from software engineering teams and the services they own. But data teams are often in a separate part of the organisation to software engineering, and the manager in common could be as high as the CEO.
Because of that, there is little communication between the two groups.
So, how do we bridge that gap between the data producers and us as data consumers?
We do so by formalising the relationship, and by being explicit about the expectations on both sides.
A good example of this formalisation can be seen with APIs, and how they formalise the relationship between different companies.
If I had a company and wanted to take payments, I could integrate with Stripe through their API. I won’t ever need to speak with someone at Stripe, but I can build on their API with confidence because it’s well defined, clearly sets expectations, and has been designed explicitly for people like me to consume.
Within an organisation, APIs are used for the same reason. They formalise the relationship between software teams and the services they are building, setting those same expectations.
So, to formalise the relationship between software engineering and data teams, we need something similar to an API to act as the interface through which we communicate in the absence of close collaboration.
That interface is the data contract.
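
To make that a little more concrete, here is a minimal sketch in Python of what a data contract could look like when written down as code. It is an illustration only, not a reference to any particular tool: the names DataContract, ContractField, orders_contract and the "checkout-team" owner are all hypothetical, and a real contract would usually carry more (SLAs, semantics, change policy). The idea is simply that the producer publishes the expectations, and either side can check records against them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractField:
    """One field the producer promises to provide (hypothetical structure)."""
    name: str
    dtype: type
    required: bool = True


@dataclass(frozen=True)
class DataContract:
    """A hypothetical, minimal data contract: who owns the data and its shape."""
    name: str
    owner: str      # the producing team accountable for this data
    version: str    # bumped when the producer changes the contract
    fields: tuple[ContractField, ...]

    def violations(self, record: dict) -> list[str]:
        """Return the reasons a record breaks the contract (empty if none)."""
        problems = []
        for field in self.fields:
            value = record.get(field.name)
            if value is None:
                # Absent or null: only a problem if the field is required.
                if field.required:
                    problems.append(f"missing required field: {field.name}")
                continue
            if not isinstance(value, field.dtype):
                problems.append(
                    f"{field.name}: expected {field.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
        return problems


# Example contract for an illustrative "orders" dataset.
orders_contract = DataContract(
    name="orders",
    owner="checkout-team",
    version="1.0.0",
    fields=(
        ContractField("order_id", str),
        ContractField("amount_pence", int),
        ContractField("coupon_code", str, required=False),
    ),
)

good = {"order_id": "A1", "amount_pence": 1250}
bad = {"order_id": "A1", "amount_pence": "12.50"}

print(orders_contract.violations(good))
print(orders_contract.violations(bad))
```

Much like with an API, the consumer doesn't need to speak to the producing team to know what to expect: the contract states it, and a check like violations() can run on either side of the interface.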

I’ll be talking more about bridging the gaps between data producers and data consumers with Jack Vanlightly and Shruthi Panicker from Confluent on July 16th.
There will also be a demo showing you exactly how to implement these interfaces.
Interesting links
Why real-time analytics is hard? by Jon Su
Good overview of the challenges of streaming and what you need to have in place to get real-time analytics working
Kill Your Data Team: Why Product Teams Should Own Data by Sven Balnojan
Confrontational title, but a good summary of the limits of centralised data teams, and interesting case studies.
Guidelines for Data Integration Processes by Roelant Vos
9 core principles for implementing data integrations/ETL.
Being punny 😅
I can cut a piece of wood in half just by looking at it. It’s true, I saw it with my own eyes.
Thanks! If you’d like to support my work…
Thanks for reading this week’s newsletter, always appreciated!
I put a lot of work into writing this every week. One thing that would really help me out is more Amazon reviews for my book. So, if you own it, even if you bought it elsewhere, please leave a review on Amazon 🙏
Thanks! Andrew