What is a data platform?
"Data platform" is one of those terms that means different things to different people.
When I talk about a data platform, I mean a platform that provides the capabilities others need to create, manage and consume data autonomously.
Those capabilities may include:
- Ingestion
- Transformation
- Orchestration
- Data management (backups, data retention, access controls)
And so on.
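To make that list a little more concrete, here is a minimal sketch in Python of what a self-serve interface over those capabilities might look like. Everything in it (`DatasetSpec`, `DataPlatform`, the field names) is hypothetical and made up for illustration. A real platform would hand each capability to proper tooling rather than in-memory dictionaries.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict[str, object]


@dataclass
class DatasetSpec:
    """What a team declares to the platform. All field names are hypothetical."""
    name: str
    ingest: Callable[[], list[Row]]                  # ingestion
    transform: Callable[[list[Row]], list[Row]]      # transformation
    schedule: str = "daily"                          # orchestration (only a label in this sketch)
    retention_days: int = 30                         # data management (recorded, not enforced here)
    readers: set[str] = field(default_factory=set)   # access controls


class DataPlatform:
    """Toy stand-in for the platform: stores specs and their output in memory."""

    def __init__(self) -> None:
        self._specs: dict[str, DatasetSpec] = {}
        self._data: dict[str, list[Row]] = {}

    def register(self, spec: DatasetSpec) -> None:
        # Teams register their own datasets without involving the platform team.
        self._specs[spec.name] = spec

    def run(self, name: str) -> None:
        # Ingest, then transform, then store the result under the dataset name.
        spec = self._specs[name]
        self._data[name] = spec.transform(spec.ingest())

    def read(self, name: str, user: str) -> list[Row]:
        # Only users listed in the spec's readers may consume the dataset.
        if user not in self._specs[name].readers:
            raise PermissionError(f"{user} may not read {name}")
        return self._data.get(name, [])


platform = DataPlatform()
platform.register(DatasetSpec(
    name="orders_clean",
    ingest=lambda: [{"id": 1, "total": 10}, {"id": 2, "total": None}],
    transform=lambda rows: [r for r in rows if r["total"] is not None],
    readers={"analytics"},
))
platform.run("orders_clean")
rows = platform.read("orders_clean", "analytics")
```

The point of the sketch is the shape of the interaction: a team describes what it wants in one place, and the platform takes care of running it and enforcing who can read the result.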
A data platform is often built and maintained by a data platform team with a mix of data engineering and platform engineering skills, who treat these capabilities as products they provide to internal customers (and as such they may have a product manager, too).
The platform will typically serve Data Engineering and ML Engineering users. What I’ve found, though, is that if you build the platform well enough it can also serve many Software Engineering use cases, since those teams also need to move data between services and share many of the requirements data teams have (e.g. dependable data that meets their expectations).
There are different ways to build a data platform. Tomorrow I’ll write about how to build a self-serve data platform and the benefits of doing so.