DDL: Data Mesh - Lessons from the Field

DataHub
14 Mar 2024, 47:09

Summary

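TLDR: In this episode of the DDL show, Darren Hacken, engineering director at AutoTrader, talks with host Shashanka, co-founder and CTO of Acryl, about how the data field has evolved and about the concept of data mesh. Darren shares his own career path and explains how AutoTrader increased its data capabilities by decentralizing its data teams. They also discuss implementing data mesh, including how data products and metadata management lead to better data governance and observability. Darren is optimistic about the future of data mesh and believes it will help organizations build and use data products in a more decentralized way.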

Takeaways

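  • 🎉 Darren Hacken is an engineering director at AutoTrader, the UK's largest automotive platform, where he is responsible for platform and data.
  • 🚀 Darren had no interest in data work early in his career, but became passionate about the field as big data technology emerged.
  • 🌐 AutoTrader's data teams are relatively decentralized: several platform teams plus data teams that each focus on a specific problem area.
  • 🔄 The data teams evolved from centralized to decentralized as the organization's appetite for data grew and its approach to managing data adapted.
  • 🤖 Data mesh is a socio-technical concept that emphasizes culture and team structure and describes how to achieve decentralization.
  • 🛠️ While implementing data mesh, AutoTrader ran into a gap between tooling that assumes centralization and its own need to decentralize.
  • 📊 Using Kubernetes and DataHub, AutoTrader is building data product thinking and practice to improve data discoverability and governance.
  • 🔧 Implementing data mesh brought new challenges around naming data products and data modeling practices.
  • 🌟 Darren considers the data product the most powerful idea in data mesh because it helps organize and use data better.
  • 🚫 Data mesh cannot be implemented overnight; overcoming the current challenges takes time and continued technical progress.
  • 🔮 Looking ahead, Darren expects data mesh and data products to further drive data use and innovation inside organizations, especially in AI and ML.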

Q & A

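  • What is Darren Hacken's current role?

    -Darren Hacken is an engineering director at AutoTrader, responsible for platform and data.

  • What is AutoTrader's main business?

    -AutoTrader is an automotive marketplace and technology platform. It is the UK's largest automotive platform and covers buying and selling cars and related services.

  • What is Darren Hacken's view of the data field?

    -Darren cares deeply about the data space. He believes data can shape and change organizations, and that the field keeps growing with developments such as AI and ML.

  • How did Darren Hacken's career change direction?

    -Early in his career Darren disliked data work because it meant repetitive work in GUI-based ETL tools. With the rise of big data technology he found the field compelling, and it became the area he is most passionate about.

  • What is the data product that Darren Hacken talks about?

    -A data product bundles data together with related capabilities. It helps an organization manage and use data more effectively and supports data discovery, analysis, and governance.

  • How do AutoTrader's data teams operate?

    -AutoTrader's data teams are decentralized. There are several platform teams and data teams, each focused on a business area such as advertising, user behavior, or vehicle pricing, and they build data products and provide self-serve analytics.

  • How does Darren Hacken view data governance and metadata management?

    -Darren sees data governance and metadata management as key requirements once data is decentralized, especially for establishing clear ownership and dependencies between data products and for ensuring data quality and security.

  • Which technologies does Darren Hacken mention?

    -Darren mentions DBT, Kubernetes, and DataHub, which help his teams create, manage, and govern data products.

  • What does Darren Hacken expect from the future of the data field?

    -Darren hopes the idea of data products will take hold more widely, and he wants to see more technology that supports decentralized data so that managing and governing data becomes easier.

  • What challenges does Darren Hacken see in the data field?

    -Darren sees the challenge as keeping data quality and data practices at a high standard, and maintaining those standards without a central team. Data naming and modeling are also ongoing challenges.

  • What does Darren Hacken think about data contracts?

    -Darren finds data contracts an interesting area. His teams currently use them mostly implicitly, relying on standardized approaches and validators to detect schema changes, and he is open to how data contracts develop in the future.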
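
The implicit contract described in the last answer, a validator that detects schema changes, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not AutoTrader's implementation: the `diff_schema` function, the `SchemaDiff` type, and the column names are hypothetical, and a schema is assumed to be a plain mapping from column name to type name.

```python
from dataclasses import dataclass


@dataclass
class SchemaDiff:
    """Difference between a published schema and a candidate schema."""
    removed: list   # columns consumers may rely on that disappeared
    retyped: list   # columns whose type changed
    added: list     # new columns, usually safe for consumers

    @property
    def breaking(self) -> bool:
        # Removing or retyping a column can break downstream consumers.
        return bool(self.removed or self.retyped)


def diff_schema(published: dict, candidate: dict) -> SchemaDiff:
    """Compare two {column: type} mappings (hypothetical schema format)."""
    removed = sorted(c for c in published if c not in candidate)
    retyped = sorted(
        c for c in published
        if c in candidate and published[c] != candidate[c]
    )
    added = sorted(c for c in candidate if c not in published)
    return SchemaDiff(removed=removed, retyped=retyped, added=added)


# Hypothetical example: a price column changes type and a column is added.
PUBLISHED = {"advert_id": "STRING", "price_gbp": "INT64"}
CANDIDATE = {"advert_id": "STRING", "price_gbp": "FLOAT64", "mileage": "INT64"}
```

A check like this could run in continuous integration for each data product and fail the build when `diff_schema(PUBLISHED, CANDIDATE).breaking` is true, which is one way to enforce a contract without writing it down as a separate document.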

Outlines

00:00

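🎤 Opening and introductions

This section covers the opening of the show. The host expresses his excitement about the topic and welcomes guest Darren Hacken, an engineering director at AutoTrader responsible for platform and data. Host Shashanka is the co-founder and CTO of Acryl and the founder of the DataHub project. Darren explains how he ended up in data and how he went from disliking data work to being passionate about it.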

05:01

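🔍 Data team structure and operation

Darren describes how AutoTrader's data teams are structured, with platform teams and data teams focused on specific domains. He emphasizes that the data teams are decentralized and explains how the data platform supports data capabilities across the organization. He also covers how data teams interact with other teams and how teams are organized around problems.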

10:03

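🌐 Understanding and practicing data mesh

Darren shares his understanding of data mesh as a socio-technical practice and a cultural shift. He mentions where data mesh came from and how it helps organizations decentralize. He then discusses how his teams started applying data mesh principles, in particular moving the technical architecture from a centralized model to more decentralized data products.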

15:04

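🛠️ Governing data products and the challenges involved

Darren discusses the challenges encountered while implementing data mesh, especially around data governance, metadata management, and observability. He points out where existing tooling falls short in supporting decentralization and explains how his teams use metadata and DataHub to address those gaps.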

20:05

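🔄 Creating and managing data products

Darren explains how his teams create and manage data products using Kubernetes as a control plane. He discusses handling data product metadata through automation and as code, and describes how they use DataHub to collect and connect data products.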
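
To make the "metadata as code" idea above concrete, here is a hedged sketch of a declarative data product description being turned into catalog records. The summary does not describe the actual manifest format, so the Kubernetes-style layout, the field names (`owner`, `domain`, `inputs`, `outputs`), the `DataProduct` class, and the record shape are all assumptions. The records are plain dictionaries for illustration and are not calls to the DataHub API.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical in-memory form of a declared data product."""
    name: str
    owner: str
    domain: str
    inputs: list = field(default_factory=list)    # upstream data products
    outputs: list = field(default_factory=list)   # published datasets


def from_manifest(manifest: dict) -> DataProduct:
    """Parse a Kubernetes-style manifest (assumed layout: metadata + spec)."""
    meta, spec = manifest["metadata"], manifest["spec"]
    return DataProduct(
        name=meta["name"],
        owner=spec["owner"],
        domain=spec["domain"],
        inputs=list(spec.get("inputs", [])),
        outputs=list(spec.get("outputs", [])),
    )


def catalog_records(product: DataProduct) -> list:
    """Flatten a data product into records a metadata catalog could ingest."""
    records = [
        {"entity": product.name, "aspect": "ownership", "value": product.owner},
        {"entity": product.name, "aspect": "domain", "value": product.domain},
    ]
    # One lineage record per declared upstream dependency.
    records += [
        {"entity": product.name, "aspect": "upstream", "value": upstream}
        for upstream in product.inputs
    ]
    return records


# Hypothetical manifest, written as a dict to keep the sketch dependency-free.
EXAMPLE_MANIFEST = {
    "metadata": {"name": "vehicle-valuations"},
    "spec": {
        "owner": "pricing-team",
        "domain": "pricing",
        "inputs": ["vehicle-stock"],
        "outputs": ["valuations_daily"],
    },
}
```

The design point is that ownership and dependencies are declared next to the data product itself, so an automated process can publish them to a catalog instead of someone maintaining them by hand.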

25:07

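🤔 Data mesh challenges and the road ahead

Darren explores the architectural strain data mesh can put on an organization and how to keep data practices at a high standard without a central team. He also raises the challenges of data naming and modeling, and explains how his teams use data contracts implicitly to handle them.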

30:09

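🚀 Outlook for data mesh

Darren is excited about the future of data mesh and data products. He expects data product thinking to help organizations make better use of data, and data mesh to shorten time to market and improve responsiveness. He also notes that AI and data products reinforce each other and is optimistic about where the technology is heading.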

35:11

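🙌 Closing and thanks

At the end of the show, host Shashanka thanks Darren for taking part and looks forward to future collaboration. They discuss the future of data products and data mesh and how community and open-source projects can move these ideas forward.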

Keywords

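💡Data Mesh

Data mesh is a data architecture pattern that distributes data ownership and responsibility to improve data management within an organization. In the video, Darren describes data mesh as a set of methods and principles for achieving decentralization, and for improving data use and governance through data product thinking.

💡Data Products

A data product packages data together with related capabilities so that users can easily access and use it. In the video, Darren stresses the importance of data products and how they help an organization understand and use its data.

💡Metadata

Metadata is data about data: information about the content, origin, format, and structure of data. In the video, Darren discusses the key role metadata plays in data mesh and data products, especially for data governance and observability.

💡Data Governance

Data governance is a set of processes, policies, and standards intended to ensure the quality, availability, and consistency of data. In the video, Darren discusses how metadata and data products improve data governance when implementing data mesh.

💡Data Platform

A data platform is the collection of technology and tools that supports data storage, processing, analysis, and visualization. In the video, Darren, who looks after the data platform at AutoTrader, shares how his teams build and maintain a platform that supports data mesh.

💡Data Teams

Data teams are groups of specialists focused on data work, including data engineers, data scientists, and analysts. In the video, Darren discusses how AutoTrader's data teams are organized around business domains and problems and how they work in a data mesh model.

💡Data Ownership

Data ownership is the responsibility for managing and controlling a data asset. In a data mesh architecture, ownership is usually distributed: each team or department manages the data related to its own business domain.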
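
Distributed ownership is only useful if it is recorded and checkable. The following sketch, written under stated assumptions, shows two checks that ownership metadata makes possible: listing data products with no owner, and listing reads that cross team boundaries. The function names, the data product names, and the input shapes are hypothetical and are not taken from the episode.

```python
from typing import Optional


def unowned_products(owners: dict) -> list:
    """Return data products whose owner is missing or empty."""
    return sorted(name for name, team in owners.items() if not team)


def cross_team_reads(owners: dict, reads: list) -> list:
    """Return (consumer, producer) pairs owned by two different teams.

    Pairs where either side has no recorded owner are skipped here;
    unowned_products reports those separately.
    """
    flagged = []
    for consumer, producer in reads:
        consumer_team: Optional[str] = owners.get(consumer)
        producer_team: Optional[str] = owners.get(producer)
        if consumer_team and producer_team and consumer_team != producer_team:
            flagged.append((consumer, producer))
    return flagged


# Hypothetical ownership labels and read dependencies.
OWNERS = {
    "ad-performance": "advertising",
    "user-sessions": "user-behaviour",
    "legacy-exports": None,
}
READS = [
    ("ad-performance", "user-sessions"),   # crosses a team boundary
    ("ad-performance", "ad-performance"),  # same team, not flagged
    ("legacy-exports", "user-sessions"),   # consumer has no owner, skipped
]
```

Cross-team reads are not a problem in themselves; the point of listing them is that each one is a dependency the two teams should know about and agree on.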

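💡Data Quality

Data quality refers to the accuracy, completeness, and consistency of data. In the video, Darren emphasizes how data products and metadata improve data quality in a data mesh model, and how technical measures help keep data reliable.

💡Data Discovery

Data discovery is the process of finding relevant and valuable information in a large body of data. It matters especially in a data mesh architecture because it helps users understand and use decentralized data.

💡Data Contracts

A data contract is an agreement between a data provider and its consumers about data format, structure, and usage rules. In the video, Darren talks about the role data contracts play in keeping data products compatible and consistent with each other.

💡Data Architecture

Data architecture is the blueprint for an organization's data, covering how data is stored, processed, managed, and used. In the video, Darren discusses data mesh as a data architecture pattern that helps organizations manage data in a decentralized way.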

Highlights

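Darren shares his passion for the data field and his role and responsibilities at AutoTrader.

Darren describes the turn in his career, from initially disliking data work to becoming a leader in the data field.

AutoTrader's data teams are decentralized, with dedicated data teams for areas such as advertising and user behavior.

Darren explains the concept of a data product and how data products enable collaboration and data sharing between teams.

AutoTrader faced challenges in building its data platform, especially around technology boundaries and data governance.

Darren discusses the concept of data mesh and how it helps organizations decentralize their data.

Darren shares AutoTrader's experience implementing data mesh, including technical challenges and cultural change.

The importance of data governance, metadata management, and observability in a data mesh implementation is discussed.

Darren describes using Kubernetes as the control plane for data products and improving efficiency through automation.

The future of data mesh is discussed, along with how it affects data use and product development inside organizations.

Darren gives his outlook on the role and future of data products and data contracts in data mesh.

The challenges of data mesh are discussed, including how to maintain data quality and the practical difficulties involved.

Darren shares his view on the future of the data mesh concept and how it adapts to a changing technology landscape.

How data mesh helps organizations use data better and make faster, better decisions is discussed.

Darren is optimistic about the future of data mesh and data products and looks forward to further technical progress.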

Transcripts

01:08

hello everyone and welcome to episode

01:11

four of the DDL show I am so excited

01:16

that we're going to be talking about a

01:17

topic that used to be exciting and has

01:21

stopped being exciting and that itself

01:23

is exciting so I'm super excited to

01:25

bring on Darren Hacken uh I think our

01:28

first conversation Darren was literally

01:32

on the data mesh learning group first

01:34

time we met um and so it's it's kind of

01:37

a full circle I'm super excited to

01:40

welcome you to the show Darren is an

01:41

engineering director heading up uh

01:43

platform and data at AutoTrader and I'm

01:46

your host Shashanka co-founder and CTO

01:49

at uh Acryl and founder of the DataHub

01:52

project so Darren tell us uh about

01:54

yourself and how you got into Data hi

01:57

shash thank you for having me today um

02:00

yeah so my name is Darren I work for a

02:02

company in the UK in the United Kingdom

02:04

called AutoTrader so we're an

02:07

automotive marketplace and technology

02:09

platform that drives it's the UK's

02:11

largest um Automotive platform so buying

02:14

and selling cars that kind of thing and

02:17

one of the areas I deeply deeply care

02:18

about is is the data space um so here at

02:21

AutoTrader I kind of look after our kind

02:24

of data platform um the capabilities

02:27

that we need in order to surface data

02:29

been working in data a long time now

02:31

maybe eight nine years um funny

02:37

story I vowed I would never work in data

02:40

because when I started my career I

02:43

worked in fintech for in a in a data

02:46

team and I absolutely hated it because

02:48

it was all GUI-based ETL tools and I

02:53

got out of this as fast as I possibly could

02:54

and said never again I love engineering

02:57

I you know I'm a coder I need to get

03:00

away and do this other thing you don't

03:03

like pointing and clicking clearly I

03:05

didn't like pointing and clicking I like

03:07

I like code um and then it kind of got

03:10

really sexy and big data and technology

03:13

changed and I think it's one of the most

03:15

exciting areas of Technology now so

03:18

never say never is probably my I always

03:20

find that a funny kind of starting point

03:22

for me in terms of data to leave a leave

03:24

a role and go never again and here I am

03:27

um so yeah passionate about data really

03:30

think it's one of them things that

03:32

really can shape and change

03:33

organizations it's um and it's it's

03:36

growing all the time right with things

03:37

like AI and LLMs and hype cycles around

03:40

things like that but yeah thanks for

03:43

having me they do say data has gravity

03:45

and you know uh normally it's like

03:47

pulling other data close to it but uh

03:51

clearly people also get attracted to it

03:53

and can never leave I was literally the

03:55

same way uh well I never went to data

03:57

and I wasn't able to leave so I was um

04:01

you know an engineer on the um online

04:03

data infrastructure teams right so I was

04:05

uh doing U display ads and uh doing

04:08

real-time bidding on ads at Yahoo and

04:12

then I uh was offered the uh chance of a

04:16

lifetime to go rebuild LinkedIn's data

04:19

infrastructure and I didn't actually

04:21

know what data meant at that point I was

04:23

scared of databases honestly because you

04:25

know it's hard to build something that's

04:27

supposed to be a source of Truth like

04:29

wait you're responsible for actually

04:31

making sure the write actually made it

04:32

to disk and it actually got flushed and

04:34

was replicated three times so that no

04:37

one loses an update well that seems like

04:39

a hard problem so you know that was my

04:42

mission impossible that I went to

04:43

LinkedIn for and I never left I've just

04:45

been in data this whole time so can

04:48

totally relate you never escape the

04:51

gravity you do not um so well so you're

04:55

you're leading big uh teams at

04:58

AutoTrader right now you know platform and

05:00

data tell me a little bit about what

05:03

that team does because you know as I

05:05

have talked to so many data leaders

05:08

around the world it seems clear to me

05:10

that all data teams are similar but not

05:13

all teams are exactly the same so maybe

05:16

walk our audience through what does the

05:19

data team do and who are the surrounding

05:21

teams and how do they interact with them

05:24

yeah um so we've so interestingly

05:28

AutoTrader as an org has been around for about

05:31

40 years so they started as a magazine

05:34

you could go into your you know local

05:36

store and find the magazine and pick it

05:39

up so that's interestingly means that as

05:41

Technologies evolved throughout the

05:42

decades you know they've gone through

05:44

many chapters of of it um but today

05:48

we're relatively decentralized in terms

05:50

of our data team setup and you know

05:52

we'll get into that I guess a little bit

05:53

more when we talk about data mesh today

05:57

um but we have a kind of platform team

06:00

so we have several platform teams and we

06:02

have a platform team um predominantly

06:04

built made up of Engineers and kind of

06:06

SRE you know folks and they build um

06:11

what we call our data platform and that

06:13

is the kind of product name I guess for

06:15

the bundling of

06:17

technology which would would help Drive

06:20

data capabilities across the

06:21

organization you know that might be

06:23

building data products which we can get

06:25

into later it could be um metadata

06:28

management how to create security

06:30

policies with data um but crucially

06:32

their play is about building

06:34

capabilities that let other people um

06:36

use these capabilities and and build

06:38

technology and other than that we try to

06:40

keep data teams closer to um the domain

06:44

of of a of an area or a problem so we

06:47

may have data teams that focus a lot on

06:50

like advertising or user Behavior maybe

06:53

more around like vehicles and pricing

06:55

and fulfillment type problems um but we

06:58

we tend to have kind of Engineers or

07:00

Engineers that specialize in data um

07:03

scientists and analysts so they they're

07:06

kind of as a discipline together and

07:08

manage together from a craft perspective

07:10

but then in terms of how how they work

07:12

together we tend to form them

07:14

around problems um pricing as I said

07:18

earlier and things like that and they

07:19

would maybe do analytics self- serve

07:22

analytics um product analytics machine

07:26

learning um you know feature engineering

07:30

very much that kind of thing and we're

07:31

trying to keep it as close to kind of

07:33

engineering as as possible so very much

07:35

a decentralized play or that's been our

07:38

our current generation of

07:41

peopleware and team topologies um got it got

07:44

it and by the way for the audience who's

07:48

listening in um definitely uh feel free

07:51

to ask questions we'll we'll try to pull

07:53

them up uh as they come in so you know

07:55

this is meant to be me talking to Darren

07:57

and Darren talking to me and all of you

07:59

being uh having the ability to kind of

08:01

participate in the conversation so um

08:04

definitely as we keep talking about this

08:06

topic uh keep asking questions and we'll

08:08

try to pull them up and um combine them

08:11

so Darren you talked a little bit about

08:13

how the teams were structured it

08:15

definitely resonated with kind of how uh

08:17

LinkedIn evolved over the over the years

08:20

I was there we started out uh with uh a

08:24

single data team that was uh responsible

08:27

for both platform as well as

08:30

uh business so you know they were

08:33

responsible for making decisions like

08:34

what warehousing technology to use and

08:37

how to go about it and then but also

08:39

building the executive dashboard and

08:42

building the foundational data sets we

08:45

had so many debates about whether to

08:47

call them foundational or gold but the

08:50

concept was still the same you build

08:52

kind of the the the canonical business

08:55

model on top of which you want all um

08:59

insights as well as you know analytics

09:02

as well as AI to be derived from and

09:04

then over the years we definitely had a

09:07

lot of stress with that centralization

09:10

and had to kind of split apart the

09:13

responsibilities uh we ended up going to

09:16

a model where there was essentially a

09:18

data unaware or semantics unaware team

09:21

that was fully responsible just for the

09:24

platform and um sub teams that emerged

09:28

out of those out of that original team

09:31

that sometimes got fully embedded into

09:34

product delivery teams to actually um

09:37

essentially have a local Loop where

09:39

product gets built data comes out of it

09:43

and then the whole Loop of creating

09:46

insights models and features and then

09:48

shipping it back into the product was

09:50

all owned and operated by um a specific

09:53

team so it looks like that's kind of

09:54

where you've ended up as well yeah in

09:58

fact that's spookily similar I mean we

10:00

started definitely more centralized and

10:03

then these teams sort of came out of

10:06

that more centralized model so like we

10:08

we built a team about user behavior and

10:10

advertising kind of build that that went

10:13

really well and then they felt a lot

10:15

more connected and it did evolve like

10:16

that um and and a lot of this I think

10:18

just spawns from scale really so I mean

10:22

my organization is definitely not at

10:23

the figures where you were previously

10:25

working shashanka but we definitely find

10:28

that you know the more hungry an

10:29

organization gets for data eventually

10:32

you you simply can't keep up with this

10:34

centralized team with this scarcity of

10:36

resource and everyone fighting over the

10:37

same thing gets really hard to think

10:39

about you know do I invest in the

10:41

finance team do I uh invest in our

10:44

advertising or our marketing team so

10:45

like eventually like partitioning almost

10:47

your resource in some way feels

10:50

inevitable that you have to to otherwise

10:52

it becomes it becomes so

10:55

hard cool so let's let's let's talk

10:58

about the topic of

10:59

the day what does data mesh mean to you

11:02

then now that we've kind of understood

11:03

how the teams have evolved and what your

11:06

uh teams are doing day to day yeah and I

11:09

think it's a really good point that we

11:11

started around teams and culture

11:14

actually because that is really what I

11:17

think the heart of what data mesh is um so

11:21

I I used to work um for Thoughtworks

11:23

where Zhamak also kind of came up with

11:26

the the data mesh thing um kind of came

11:28

from and I I wasn't working at the time

11:31

but I remember reading it and we've

11:34

we're already on this journey of like we

11:36

need to decentralize and our platform is

11:39

really important to us and we need

11:41

capabilities and we want more people to

11:43

do that and in fact you know we were

11:46

succeeding at decentralizing and scaling

11:49

um but I think when we did that we were

11:51

entering new spaces where a lot of

11:53

people hadn't really talked about it so

11:55

for me data mesh one of the things that

11:57

it means it's a you know socio technical

12:00

thing a cultural thing it's like DevOps

12:02

really or something like that for me

12:04

she's done a great job describing how

12:07

to

12:09

um you know get there like data products

12:13

and all this kind of thing but one of

12:14

the great things I think that Zhamak did with

12:17

talking about data mesh was build a

12:18

lexicon a grammar a way of us all

12:21

communicating to each other like

12:22

shashanka me and you met on a on a data

12:25

mesh you know community and immediately

12:28

we we were able to speak at a level that

12:31

we simply wouldn't have been able to

12:32

maybe if we would have met five years

12:34

ago and try to have the same

12:36

conversation um so a lot of it's that

12:38

for me that's what data mesh is it's

12:39

about it's a it's a method or an

12:41

architectural pattern or set of

12:43

principles or guidelines about how you

12:45

could achieve decentralization and and

12:48

move away from this this Central team

12:51

and kind of break apart from it um and

12:53

that has been and that has been the big

12:56

draw right to of of the concepts because

12:59

a lot of people relate to it uh and kind

13:02

of resonate with it and then that from

13:06

that um what is it Summit of Hope comes

13:09

the the valley of Despair where you you

13:13

start figuring out okay how do I

13:15

translate this idea into reality and how

13:19

much do I need to change um so walk us

13:22

through your journey of like how have

13:24

you implemented data mesh how have you

13:26

taken these principles and brought them

13:29

to life or at least attempted to bring

13:31

them to life and we'll see how you feel

13:32

about it like would you give yourself an

13:34

a grade or a b-grade we we'll we'll

13:37

figure that out later but what have you

13:38

done to bring it to life so so at the point

13:44

when we started trying to apply data

13:47

mesh um we were in this place where we

13:49

we decentralized some of our teams but

13:52

our technology underneath is still very

13:54

much centralized and shared so almost

13:56

like a monolith with teams contributing

13:59

to it but everything was partitioned or

14:03

structured around technology so we'd end

14:05

up with I don't know a DBT project or

14:07

something right or we had a monolith

14:09

around spark jobs and things it's very

14:12

technology partitioned um and then when

14:15

we started looking at data mesh we were

14:17

really excited because one of the big

14:19

things that we took out of it was this term

14:22

data product and we're like great we've

14:24

now found this this this language to

14:27

describe how we were going to try and

14:29

break things down like before that we

14:31

were trying to break break you know lots

14:33

of data down into chunks of data but we

14:36

just couldn't think of like the wording

14:37

gave us a lot more power to to start

14:39

communicating so we we started trying to

14:41

break down our DBT monolith essentially

14:45

into Data products um and that's been

14:47

one of our journeys of like breaking it

14:49

partitioning it and doing that so that

14:51

was the big starting point of doing that

14:55

um so it was very much like we had some

14:57

teams that were decentralized and then

14:59

like how do we almost catch the

15:01

technology up so DBT was the starting

15:04

point of

15:06

that so you went from a monolithic repo

15:09

where all of your transformation logic

15:12

was being

15:14

hosted to chopping it up and um

15:17

splitting it up uh across multiple

15:19

different teams um great so once you did

15:23

that what did you then

15:25

find well then you find that the tooling

15:28

and system that we've got today has some

15:31

gaps when you start to think about

15:33

decentralization like a lot of the

15:35

technologies that we use in the data

15:36

space do promote very much very Central

15:39

centralized approach um like I think

15:42

it's becoming a little bit less popular

15:43

but you know Airflow it'd be like one

15:45

Airflow for your whole

15:46

organization DBT might say one big

15:49

project even though they are saying

15:50

that less now but there was definitely a

15:52

period where like you know that was the

15:54

that was the popular approach so we you

15:58

broke things apart

15:59

and now you've got gaps between data

16:01

products where you've got DBT and DBT

16:05

and now you've got gaps and that's where

16:06

you really start to realize that there

16:08

are other requirements that start to

16:10

come in that you need and two big ones

16:12

that felt obvious for us were around

16:15

data governance metadata kind of knowing

16:19

more about these data products at a

16:21

meta level observability and

16:25

how you define that and also how you

16:27

start creating security policy between

16:29

them so it's the classic thing of when

16:31

organizations move to microservices like

16:34

all of a sudden like monitoring between

16:36

things things breaking in you know in

16:38

the infrastructure level between the the

16:40

network protocol starts to

16:42

happen I think the data world is not

16:46

there and is catching up and I think it

16:49

will one day but today they were some of

16:51

the gaps that we started to see um so

16:54

like by breaking down I'll give you an

16:55

example so like by breaking down DBT say you

16:58

have this monolith with maybe I don't

17:01

know 50 people working on an area of a

17:03

monolith and then you break that down

17:04

into Data products you then start to

17:07

realize well we didn't really have clear

17:09

ownership with that like who owned it

17:11

like people were contributing together

17:13

as maintainers maybe but who owns who

17:16

owns this data asset who actually who is

17:18

the team that do it and that's where we

17:20

started to realize well you need kind of

17:23

metadata over the top to start labeling

17:25

things like that or we also had this

17:27

other symptom coming out because we had

17:29

all of our code in one place it was very

17:31

easy for like team a and Team B to use

17:33

data between each other and not really