
Who is the data catalog for?

Many organisations have a data catalog as part of their data stack. But as we move towards a more decentralised model with data mesh and data contracts, it’s worth revisiting who the data catalog is for.

Many existing data catalogs focus on indexing all your data warehouses, data lakes, and so on, and presenting that index through a single web interface.

They tend to be bought by the data team, and an index of everything might be useful if you’re a data engineer and want to decommission things, or if you’re thinking about data governance and want more control over the disparate data systems.

But it’s not useful if you’re trying to make data products accessible and easy to find, and to encourage greater use and sharing of good-quality data, which is the point of a data mesh-like architecture.

For those users we need something different.

Earlier this week I attended the Data Mesh Learning Meetup on Data Catalogs vs Product Catalogs. It included small-group breakout discussions on what a data product is and how a product catalog might differ from a data catalog.

I hadn’t come across the term Product Catalog before, but it sounds a lot like what I had in mind when I wrote about a contract-driven data catalog recently: a more curated catalog of data products that tells a user everything they need to know to use the organisation’s data.

So, maybe it is worth defining Data Product Catalogs as a new category to distinguish between the current index-based data catalogs and something that supports a more decentralised architecture and the users of that data.
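To make the distinction a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the class names (IndexedTable, DataProduct), the fields (owner, description, contract_version, tags) and the search_products helper are all hypothetical and not taken from any particular catalog tool. The idea is that an index-based catalog lists whatever it finds, while a data product catalog only holds entries someone has deliberately published with a contract behind them.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedTable:
    """Hypothetical entry in an index-based catalog: whatever was found in a system."""
    system: str
    name: str


@dataclass
class DataProduct:
    """Hypothetical curated entry: published deliberately, backed by a data contract."""
    name: str
    owner: str
    description: str
    contract_version: str
    tags: list[str] = field(default_factory=list)


def search_products(products: list[DataProduct], term: str) -> list[DataProduct]:
    """Return products whose name, description or tags mention the term."""
    term = term.lower()
    return [
        p for p in products
        if term in p.name.lower()
        or term in p.description.lower()
        or any(term in t.lower() for t in p.tags)
    ]


# An index of everything: useful for engineers, noisy for data consumers.
index = [
    IndexedTable("warehouse", "tmp_orders_backup_2"),
    IndexedTable("warehouse", "orders_v3_final"),
    IndexedTable("lake", "raw/orders/2021"),
]

# A curated product catalog: fewer entries, each with an owner and a contract.
products = [
    DataProduct(
        name="orders",
        owner="checkout-team",
        description="One row per confirmed customer order.",
        contract_version="1.2.0",
        tags=["sales", "orders"],
    ),
]

matches = search_products(products, "orders")
for product in matches:
    print(product.name, product.owner, product.contract_version)
```

A user searching the index for "orders" would get three candidates and have to guess which one to trust. Searching the product catalog returns one entry with an owner and a contract version attached.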


    Andrew Jones
    Author
    I build data platforms that reduce risk and drive revenue. Guaranteed, with data contracts.