
The costs of storing data

·1 min

Since Hadoop came along in 2006 and significantly reduced the cost of storing “big data”, we’ve often focused on how much data we can bring in centrally, with the assumption that we’ll use it to create value later.

But by prioritising quantity over quality, many organisations found it took so much effort to use this data that in practice they just couldn’t justify it. As a result, much of the data we worked so hard to bring in centrally was left unused. It’s dark data, rotting in a data swamp.

Even though storage was (and still is) relatively cheap, it’s not free.

Worse, there’s also an increased risk associated with data that is stored but unused. It tends to go unmanaged and forgotten about, which increases the risk of data misuse and data leaks, particularly if personal or sensitive data is involved.

And that can lead to serious costs for the organisation.

Andrew Jones
Author
I build data platforms that reduce risk and drive revenue.