Talking about data contracts on the Catalog & Cocktails podcast
I had a really great chat last week with Juan and Tim on their Catalog & Cocktails podcast. They were both so interested in the topic and asked excellent questions, and I think we did their “no BS” tagline justice!
I was particularly pleased with this comment about the podcast on LinkedIn from Daniel Taylor:
This was a really interesting interview. I’ll admit my own understanding of ‘data contracts’ was ill defined, but hearing Andrew’s description of it, the purpose and process, was illuminating.
Rather than thinking of it as a type of document, the essential part of it is really the process involved in putting it all together. It’s easy for us to fall into the sense of powerlessness over the source of our data. This approach seems particularly well suited to encouraging data engineering teams to not just ‘shift left’ within their own scope, but also engage with software engineers and development teams.
It really stood out to me that getting engagement at that level was based on the devs feeling the same pain and uncertainty about data quality that we as data engineers deal with every day, and then addressing that uncertainty with the sort of document they’re used to reading and writing, in their own ecosystem.
That’s exactly right! Data contracts are less about the document and tooling, and more about how we use them to engage software engineers and apply more discipline to how data is generated.
That’s why this is what I focussed on most in my book: not the implementation (that’s just one chapter, which is all it needed), not CI checks, but using data contracts to change the way data is generated, managed and consumed.