
The right level of abstraction

How do you find the sweet spot?

Hello 👋

This week I write about finding the right level of abstraction.

There are also links to articles on the outbox pattern at scale, the first 90 days as a CDO, and measuring latency.


The right level of abstraction

When building a (data) platform you end up thinking a lot about the abstractions you are providing, and the trade-offs they cause.

On one hand, you want to abstract away some details of the underlying platform to reduce the cognitive load of your users.

On the other hand, abstract away too much and your users will not have the knowledge to manage their services with autonomy. Instead, they will need to ask you about everything, and you become a bottleneck.

The solution we landed on when implementing data contracts at GoCardless was to provide an abstraction that made it easy to define a data contract, which then configured the resources needed to provide an interface for the data and the tooling needed to manage the data (our contract-driven data platform).

At the same time, we ensured there were as few layers as possible between that abstraction and the standard Google Cloud resources. The data contract directly configured the BigQuery table, the Pub/Sub topic, and so on, with the only layer in between being our existing infrastructure-as-code platform.

The data contract worked because it abstracted enough complexity, whilst leaving the actual decisions to the people who owned the data.
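To make that concrete, here's a minimal sketch of what such an abstraction might look like. This is illustrative only, not the actual GoCardless implementation: all names, fields, and conventions here are hypothetical. The idea is that the owner declares only the decisions that are theirs to make (schema, ownership, retention), and the platform derives the underlying resource configuration from that.

```python
from dataclasses import dataclass


@dataclass
class DataContract:
    """Hypothetical data contract definition (illustrative sketch).

    The data owner fills in the fields they should decide on;
    the platform derives the underlying resources (BigQuery
    table, Pub/Sub topic) from them.
    """
    name: str
    owner: str
    schema: dict  # field name -> type, decided by the data owner
    retention_days: int = 30

    def bigquery_table_id(self, project: str, dataset: str) -> str:
        # Naming conventions are decided by the platform, not the user.
        return f"{project}.{dataset}.{self.name}"

    def pubsub_topic(self, project: str) -> str:
        return f"projects/{project}/topics/{self.name}-events"


contract = DataContract(
    name="payments",
    owner="payments-team@example.com",
    schema={"payment_id": "STRING", "amount": "NUMERIC"},
)
print(contract.bigquery_table_id("my-project", "contracts"))
# -> my-project.contracts.payments
```

The key design point: the contract is the only interface the user touches, and everything below it maps directly onto standard cloud resources with no extra platform layers to learn.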

That’s the sweet spot.


Lessons from using the outbox pattern at scale by Sugat Mahanti (Zapier)

Most of us won't get to this scale, but it's an interesting write-up.

The outbox pattern is a great pattern for sending events from upstream services to other consumers, including data warehouses. Unlike CDC, you're not replicating the database; you're sending events that can be designed for consumption and have change management applied to them.
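The core of the pattern can be sketched in a few lines. This is a generic illustration (using SQLite for brevity), not the implementation from the article: the business write and the outgoing event are committed in the same transaction, so neither can be lost without the other, and a separate relay forwards unpublished events to the broker.

```python
import json
import sqlite3

# In-memory database standing in for the service's own database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id TEXT PRIMARY KEY, amount REAL)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "payload TEXT, published INTEGER DEFAULT 0)"
)


def record_payment(payment_id: str, amount: float) -> None:
    # One transaction covers both the state change and the event,
    # so they succeed or fail together.
    with conn:
        conn.execute("INSERT INTO payments VALUES (?, ?)", (payment_id, amount))
        event = {"type": "payment_created", "id": payment_id, "amount": amount}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_outbox(publish) -> int:
    # A separate process polls unpublished rows, forwards them to the
    # message broker (e.g. Pub/Sub), and marks them as published.
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    conn.commit()
    return len(rows)


record_payment("pay_123", 25.0)
relay_outbox(print)  # prints the payment_created event once
```

Because the event is designed for consumers (rather than mirroring table rows, as CDC does), it's a natural place to apply a data contract and change management.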

It’s a common pattern to use with data contracts, and it’s what we used in the implementation I wrote about in today’s newsletter.

The First 90 Days as a CDO: Forget the Data. Read the Room by Julia Bardmesser (on LinkedIn)

I thought I’d spend them assessing the data landscape, building a strategy, and identifying quick wins.

What I actually spent them doing was figuring out who trusted what, who was threatened by whom, and why the last three data initiatives had quietly died.

Measuring Latency in Data Platforms by Rodrigo Molina

Practical ideas for measuring latency, and why it’s important to do so.


Being punny 😅

I’ve got some racing geese for sale. Let me know if you want to take a quick gander.

(I’ve been playing Untitled Goose Game with my 6 year old recently, he loves it!)


Thanks! If you’d like to support my work…

Thanks for reading this week's newsletter — always appreciated!

If you’d like to support my work consider buying my book, Driving Data Quality with Data Contracts, or if you have it already please leave a review on Amazon.

🆕 I’ll be running my in-person workshop, Implementing a Data Mesh with Data Contracts, in June in Belgium. It will likely be the only in-person workshop this year. Do join us!

Enjoy your weekend.

Andrew


Want great, practical advice on implementing data mesh, data products and data contracts?

In my weekly newsletter I share with you an original post and links to what's new and cool in the world of data mesh, data products, and data contracts.

I also include a little pun, because why not? 😅



Andrew Jones
Author
I build data platforms that reduce risk and drive revenue.