The problem with defaults

Let’s say 20 years ago you ran your code in an environment you configured simply as python. The obvious default would have been Python 2. But today the obvious default is Python 3. If you deployed that same code with the same configuration today, what Python environment would you expect? What would you expect that to be in 20 years’ time?

What about defining whether a field or a dataset contains personal data? What would be the best default? Most data is not personal data, so would you default to the most common answer, no, and save people typing that each time? But then it’s not obvious whether someone has made a decision on that or not. They may not even know a decision needs to be made, and simply be unaware that the definition exists. That could easily lead to personal data not being defined as such, increasing the risk of a privacy incident.
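To make that concrete, here is a minimal sketch in Python of what dropping the default could look like. The `FieldDefinition` class and its attribute names are hypothetical, made up for illustration rather than taken from any particular tool. The point is only that `personal_data` has no default, so a definition that skips the decision is rejected instead of silently being treated as not personal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """Hypothetical schema field; every attribute must be supplied."""

    name: str
    data_type: str
    # Deliberately no default: whoever defines the field has to decide.
    personal_data: bool


# Both answers are written down, so a reader can see a decision was made.
email = FieldDefinition(name="email", data_type="string", personal_data=True)
order_total = FieldDefinition(
    name="order_total", data_type="decimal", personal_data=False
)

# Leaving the decision out fails loudly instead of defaulting to "no".
try:
    FieldDefinition(name="postcode", data_type="string")
except TypeError as error:
    print(f"Rejected: {error}")
```

The cost is typing `personal_data=False` on most fields. The benefit is that a missing answer can never be mistaken for a considered one.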

You’re making a trade-off. You’re trying to make this easier for the person configuring that environment or categorising data, but by being ambiguous you’re increasing the risk of something going wrong.

And in these examples, how often are people configuring that environment, or creating new data products? Probably not that often. So how much time are you saving? Probably very little. And almost certainly not enough to justify that increased risk.



Andrew Jones
Author
I build data platforms that reduce risk and drive revenue.