
Patterns for publishing events


Happy Friday!

This week is the start of a mini-series on publishing events.

Also links on unlearning data architecture, a simple data governance framework, and AI for data engineers.

Finally, on Wednesday (20th August) I’ll be live on Loosely Coupled comparing Data Mesh and Application Integration with Karol Skrzymowski and Rachel Barton. Should be fun - join us on LinkedIn or YouTube!


Patterns for publishing events

With data contracts we want to move to a model where data and events are published to consumers, rather than replicated from databases. That requires adopting new patterns and providing the tooling that enables them.

Over the next few weeks I’m going to share three patterns for publishing events from services, starting today with the simplest: publishing events directly to an event broker.

This is really as simple as it sounds. When your service writes to a database, also send an event to the event broker.

[Diagram: a service writes data to a database (step 1), then sends an event to an event broker (step 2), which forwards the data to the data warehouse.]

We use an event broker, such as Kafka, Google Cloud Pub/Sub, or Amazon SNS, because writing to it is quick and easy. We then pull events from the broker to populate our data warehouse.
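Here’s a minimal sketch of what that looks like in code, assuming Postgres via psycopg2 and Kafka via confluent-kafka. The orders table, topic name, and event shape are all illustrative, not a prescribed schema:

```python
import json
import psycopg2
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def create_order(conn, order_id: str, amount: int) -> None:
    # 1. Write to the database as part of normal request handling.
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO orders (id, amount) VALUES (%s, %s)",
            (order_id, amount),
        )
    conn.commit()

    # 2. Publish the event to the broker for downstream consumers.
    # If this fails after the commit above, the two systems are now
    # inconsistent -- that's the dual-write problem discussed below.
    event = json.dumps({"event": "order_created", "id": order_id, "amount": amount})
    producer.produce("orders.order_created", value=event.encode("utf-8"))
    producer.flush()
```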

While simple, performant, and easy to implement, this pattern has one major drawback, and that’s known as the dual-write problem.

This happens when you have a system that needs to update two different places, in our case the database and the event broker. Logically, that should be one operation, but in this case it will be two operations, as shown below. If one fails, such as the write to the event broker, then data in those two systems will be inconsistent.

[Diagram: a user takes an action, the service inserts data into the database and then publishes an event for downstream consumption; the publish fails, shown with a red “X”.]

Because there’s no single atomic transaction covering both systems, one write can succeed while the other fails, leaving the systems out of sync and our data inconsistent.
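To see why ordering tricks don’t save us, here’s a hypothetical variant of the sketch above that publishes before committing and rolls back if the publish fails. The names are again illustrative:

```python
import json
import psycopg2
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def create_order(conn, order_id: str, amount: int) -> None:
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO orders (id, amount) VALUES (%s, %s)",
            (order_id, amount),
        )
    try:
        event = json.dumps({"event": "order_created", "id": order_id})
        producer.produce("orders.order_created", value=event.encode("utf-8"))
        producer.flush()  # wait for the broker to take the message
    except Exception:
        conn.rollback()  # undo the database write if the publish fails...
        raise
    conn.commit()
    # ...but if the process crashes between flush() and commit(), the
    # event was published while the row was never committed. No
    # ordering of the two writes closes that window.
```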

That level of inconsistency might be OK for your use case: you may prefer slightly inconsistent data to introducing more complexity and potential performance issues into your architecture.

However, it might be that you need more consistent data than you can achieve with this pattern, and for that we need to use a more complex pattern.

I’ll be discussing two of these patterns over the next two weeks, starting with the outbox pattern next week.


Unlearning Data Architecture: 10 Myths Worth Killing by Bernd Wessely (or via Freedium)

It’s always good to challenge assumptions, and this article is a great list of things we should reconsider.

Why do we Need a Simple Data Governance Framework? by Nicola Askham

Data governance doesn’t need to be complicated. KISS.

AI for data engineers with Simon Willison (podcast)

Good discussion on many topics, including LLMs for data extraction, MCP security, and Postgres permissions.


Being punny 😅

I accidentally drank a bottle of invisible ink. Now I’m in hospital, waiting to be seen


Thanks! If you’d like to support my work…

Thanks for reading this week’s newsletter — always appreciated!

If you’d like to support my work consider buying my book, Driving Data Quality with Data Contracts, or if you have it already please leave a review on Amazon.

Enjoy your weekend.

Andrew




Andrew Jones
Author
I build data platforms that reduce risk and drive revenue.