Past performance is no guarantee of future performance

If I want to use a dataset to build a critical data application, I need to make a decision on whether it meets my requirements.

I could do that using the past performance of that dataset. For example, if I can see that it has been updated hourly for the last 30 days, I could assume it will always be updated hourly, and build that assumption into my application design.

However, past performance is no guarantee of future performance.

The dataset may have been updated hourly over the last 30 days, but if the data producers aren’t committing to that level of support (there are no SLOs, no on-call, etc.), then at some point that performance will degrade.

And when it does, it can have a significant impact on my data application and my users, because I had assumed the performance was better than it really was.

That’s why you need the performance to be explicitly defined in a data contract, and for the data producer to accept responsibility for that performance.
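As a rough illustration, here is a minimal Python sketch of what an explicit freshness commitment might look like, and how observed updates could be checked against it. The `FreshnessContract` class, its field names, the `breaches` helper, and the `orders` dataset are all hypothetical and not taken from any particular data contract tool or standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FreshnessContract:
    """Hypothetical contract terms; the field names are illustrative only."""
    dataset: str
    owner: str                  # producing team accountable for the commitment
    max_update_gap: timedelta   # committed maximum time between updates
    on_call: bool               # whether the producer provides on-call support


def breaches(contract, update_times):
    """Return (previous, current) pairs of consecutive updates
    whose gap exceeds the committed maximum."""
    ordered = sorted(update_times)
    return [
        (prev, curr)
        for prev, curr in zip(ordered, ordered[1:])
        if curr - prev > contract.max_update_gap
    ]


contract = FreshnessContract(
    dataset="orders",
    owner="orders-team",
    max_update_gap=timedelta(hours=1),
    on_call=True,
)

start = datetime(2024, 1, 1)
# Three hourly updates, then one that arrives after a three-hour gap.
observed = [start + timedelta(hours=h) for h in (0, 1, 2, 5)]
late = breaches(contract, observed)
print(late)
```

The check itself only looks backwards, just like the 30-day history did. What changes with a contract is that a gap in updates becomes a breach of a stated commitment with a named owner, rather than a surprise.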

Only then can you build on the dataset with confidence.



    Andrew Jones
    Author
    I build data platforms that reduce risk and drive revenue.