Source systems must provide applicable data
One of our aims with data contracts was to move away from our existing Change Data Capture (CDC) architecture, where the entire database is synced to the data warehouse with exactly the same structure.
This was needed if we were to meet the following goals:
- Change management: Software engineers should be able to change their database without impacting downstream consumers
- Data applicability: Have the source systems provide data that is immediately applicable
I’ve talked a lot about change management in this newsletter, but the second goal, data applicability, is just as important.
With data contracts, the data produced by the source system no longer needs to look anything like the source database.
Instead, the data can be modelled so that it:
- Meets the requirements of the consumers
- Is immediately useful, without joining or other transformation
- Represents data using common business language, not service-specific language
That reduces the time to insights, reduces transformation costs, and makes the data useful to a much wider audience, who no longer need to learn about the service just to use the data.
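To make the difference concrete, here is a minimal Python sketch of a source system publishing a contract-shaped record instead of its raw tables. Everything in it is invented for illustration: the `orders` and `customers` rows, the column names, the status codes, and the `OrderRecord` shape are assumptions, not a real schema.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical rows as they might sit in a service's database.
# Under CDC, these would be synced to the warehouse exactly as they are.
ORDERS_ROW = {"ord_id": 1001, "cust_id": 7, "st_cd": 3, "amt_pence": 2599}
CUSTOMERS_ROW = {"cust_id": 7, "cntry": "GB"}

# Hypothetical mapping from a service-internal status code to business language.
ORDER_STATUS = {1: "placed", 2: "paid", 3: "shipped"}


@dataclass(frozen=True)
class OrderRecord:
    """The shape published under the (assumed) data contract."""
    order_id: int
    customer_country: str
    status: str
    amount_gbp: Decimal


def to_order_record(order: dict, customer: dict) -> OrderRecord:
    """Build the consumer-facing record inside the source system.

    The join, the status decoding, and the unit conversion all happen here,
    so consumers do not need to repeat them downstream.
    """
    return OrderRecord(
        order_id=order["ord_id"],
        customer_country=customer["cntry"],
        status=ORDER_STATUS[order["st_cd"]],
        amount_gbp=Decimal(order["amt_pence"]) / 100,
    )


record = to_order_record(ORDERS_ROW, CUSTOMERS_ROW)
print(record)
```

In this sketch a consumer never sees `st_cd` or `amt_pence`, so the service is free to rename or restructure those columns as long as `to_order_record` keeps producing the same `OrderRecord`.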
With data contracts, our source systems provide applicable data.