Data teams evolve to focus on dependability
Yesterday I wrote about how data teams often start with accessibility, which helps you get started but leads to a number of problems:
- The terms used in the upstream database may mean something to that service, but not to the business (customer vs company vs organisation vs id …), so the data needs refining
- Only data that has been refined by data engineering can be used. The rest of it is potentially valuable for other use cases but remains inaccessible
- Building on top of the DB is unstable, leading to data incidents. Users may then start to lose trust in the data
These problems are exacerbated by the changing data requirements of your business, which is now moving from reporting and analytics to using data as a competitive advantage. That could include using data to drive key business processes or to power revenue-generating product features (including AI-based features).
To meet those requirements you need to evolve your focus onto dependability.
That means applying more discipline to the creation and management of data, including the change management of that data.
That means changing how the data is generated at the source, because if the source data is not dependable, the downstream data cannot be.
That means using different tools to move the data around, moving away from building on top of upstream databases and instead having data provided through interfaces.
Data contracts support that evolution.
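To make the idea of data provided through an interface a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a definitive implementation: the `ContractField` class, the `CUSTOMER_CONTRACT` definition, its field names and the `validate` helper are all hypothetical, and a real data contract would usually cover much more (ownership, semantics, change management).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ContractField:
    """One field the source team agrees to provide (hypothetical structure)."""
    name: str
    dtype: type
    required: bool = True


# Hypothetical contract for a "customer" record published by a source service.
CUSTOMER_CONTRACT = [
    ContractField("customer_id", str),
    ContractField("organisation_name", str),
    ContractField("signup_date", date),
    ContractField("referral_code", str, required=False),
]


def validate(record: dict, contract: list[ContractField]) -> list[str]:
    """Return a list of contract violations; an empty list means the record conforms."""
    errors = []
    for field in contract:
        value = record.get(field.name)
        if value is None:
            # Absent optional fields are allowed; absent required fields are not.
            if field.required:
                errors.append(f"missing required field: {field.name}")
            continue
        if not isinstance(value, field.dtype):
            errors.append(
                f"wrong type for {field.name}: expected {field.dtype.__name__}"
            )
    return errors
```

The source service could run a check like `validate(record, CUSTOMER_CONTRACT)` before publishing, so that a breaking change is caught at the source rather than discovered downstream as a data incident.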