The language of data engineering
The language we use is important. It helps define culture, it shapes behaviours, and it defines our identity.
That’s why I find the language we use in data engineering interesting.
For example, for the last few years we’ve had the expression “data is the new oil”. That expression has since seeped into how we build with data, as we talk about how data needs to be refined before it can be useful.
That in turn influences how we view the role of data engineering. We see ourselves as refiners, taking a dirty raw product and producing something that can then be used in many different applications (like a plastic).
However, refining data is difficult and expensive, and it becomes a bottleneck. You can’t refine all data, so you limit the amount of data that is useful.
So, why do we need to refine data?
Well, we don’t.
We can create a cleaner raw product at the source, reducing or sometimes even eliminating the refining needed. Doing so reduces overall costs to an organisation and improves the outcome.
But to do so, we need to change our identity, change our behaviours, and change our culture.
We also need to change our language.
P.S. This post was inspired by a (perfectly reasonable and valid!) question I got during my presentation for Data Mesh Learning yesterday. If you missed it you can watch it anytime on YouTube.