A multi-million-dollar data quality issue
Someone reached out and asked me to present on data contracts to their organisation after a data quality issue directly cost them millions of dollars in lost revenue.
I can’t go into specifics, but it went like this.
One of their company’s product features was billed based on usage.
It was the data team’s responsibility to prepare that data for billing.
That usage information wasn’t directly collected by the product information team for billing. But they did have some instrumentation on the product, and with a bit of massaging that could be used to accurately track the usage of the feature by customer.
So the data team used that. And for a while it seemed to be working fine.
Now, this product feature had different tiers, and you paid more depending on the tier you chose to use.
A new, higher tier was added by the product team.
But the product team forgot to add the instrumentation to the new tier, because they didn’t really use the instrumentation data themselves.
And they didn’t realise anyone else was using that data.
It was a couple of months before someone realised that no customers were being billed for the highest tier of usage.
It took some investigation, but they found the problem. The data was not being collected for that tier, and there was no usage data.
And since there was no data available for those couple of months, they had no way of billing their customers for the usage. That revenue was lost.
So, what was the root cause of this problem?
An important business process was relying on data not built for it.
The data they were using was a byproduct of the product engineering team’s instrumentation. When the product team no longer needed it for their own purposes, they forgot all about it.
And that’s why they were interested in data contracts.
My goal with data contracts has always been to facilitate the more explicit generation of data - data that meets the requirements of those who depend on it.
This story shows why that’s so important.
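To make that concrete, here is a minimal sketch of the kind of check an explicit agreement makes possible. It is not what this organisation built. The `UsageEvent` shape, the tier names, and the `missing_tiers` helper are all hypothetical, chosen only for illustration. The idea is that the tiers billing depends on are written down, so a tier with no usage data is flagged rather than silently skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One usage record, as billing would consume it (hypothetical shape)."""
    customer_id: str
    tier: str
    units: int


# Hypothetical contract: the tiers billing expects to receive usage data for.
EXPECTED_TIERS = {"basic", "standard", "premium"}


def missing_tiers(events, expected_tiers=EXPECTED_TIERS):
    """Return the expected tiers, sorted, that have no usage events at all."""
    seen = {event.tier for event in events}
    return sorted(set(expected_tiers) - seen)


# Illustrative data only: no events were ever emitted for the newest tier.
events = [
    UsageEvent("c1", "basic", 10),
    UsageEvent("c2", "standard", 25),
]

gaps = missing_tiers(events)
if gaps:
    print(f"No usage data for tiers: {', '.join(gaps)}")
```

A tier could legitimately have no usage in a given period, so a result like this is a prompt to investigate, not proof of a fault. The point is that the question gets asked in days rather than months.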