
The costs of storing data

·1 min

Since Hadoop came along in 2006 and significantly reduced the cost of storing “big data”, we’ve often focused on how much data we can bring in centrally, with the assumption that we’ll use it to create value later.

But by prioritising quantity over quality, many organisations found it took so much effort to use this data that in practice they just couldn’t justify it. As a result, much of the data we worked so hard to bring in centrally was left unused. It’s dark data, rotting in a data swamp.

Even though storage was (and still is) relatively cheap, it’s not free.

Worse, there’s also an increased risk associated with data that is stored but unused. It tends to go unmanaged and forgotten about, which increases the risk of data misuse and data leaks, particularly if personal or sensitive data is involved.

And that can lead to serious costs for the organisation.

Andrew Jones
Author
I build data platforms that reduce risk and drive revenue.