
The problem with defaults


Let’s say 20 years ago you ran your code in an environment you configured simply as python. The obvious default would have been Python 2. But today the obvious default is Python 3. If you deployed that same code with the same configuration today, which Python would you expect to get? And which would you expect in 20 years’ time?
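One way to avoid relying on that default is to refuse the ambiguous value altogether. Here is a minimal sketch, assuming a made-up runtime string format like python3.12; the parse_runtime helper and the format itself are illustrative, not taken from any particular platform.

```python
import re

# Hypothetical format: "python<major>.<minor>", e.g. "python3.12".
# A bare "python" is rejected on purpose, because its meaning changes over time.
PINNED_RUNTIME = re.compile(r"python(\d+)\.(\d+)")


def parse_runtime(runtime: str) -> tuple[int, int]:
    """Return (major, minor) for a pinned runtime, or raise if it is not pinned."""
    match = PINNED_RUNTIME.fullmatch(runtime)
    if match is None:
        raise ValueError(
            f"runtime {runtime!r} is ambiguous; pin a version such as 'python3.12'"
        )
    return int(match.group(1)), int(match.group(2))
```

With this, parse_runtime("python3.12") returns (3, 12), while parse_runtime("python") fails loudly at deploy time instead of quietly meaning something different a few years later.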

What about defining whether a field or a dataset contains personal data? What would be the best default? Most data is not personal data, so would you default to the most common answer, no, and save people typing that each time? But then it’s not obvious whether someone has actually made that decision or not. They may not even know a decision needs to be made, because they are unaware the setting exists. That could easily lead to personal data not being marked as such, increasing the risk of a privacy incident.
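The alternative is to make the decision mandatory. As a rough sketch, assuming a field definition arrives as a plain dictionary, it might look like the following; the FieldDefinition class, the load_field helper and the key names are all hypothetical and only there to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    name: str
    data_type: str
    personal_data: bool  # deliberately no default: someone has to decide


def load_field(raw: dict) -> FieldDefinition:
    """Build a field definition, refusing to guess whether it holds personal data."""
    if "personal_data" not in raw:
        raise ValueError(
            f"field {raw.get('name')!r}: 'personal_data' must be set to true or false"
        )
    value = raw["personal_data"]
    if not isinstance(value, bool):
        raise ValueError(
            f"field {raw.get('name')!r}: 'personal_data' must be a boolean"
        )
    return FieldDefinition(
        name=raw["name"], data_type=raw["type"], personal_data=value
    )
```

Typing personal_data: false costs a few seconds, but it leaves a record that someone looked at the field and made a call.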

You’re making a trade-off. You’re trying to make this easier for the person configuring that environment or categorising data, but by being ambiguous you’re increasing the risk of something going wrong.

And in these examples, how often are people configuring that environment, or creating new data products? Probably not that often. So how much time are you saving? Probably very little. And almost certainly not enough to justify the increased risk.


    Andrew Jones
    Author
    I build data platforms that reduce risk and drive revenue. Guaranteed, with data contracts.