Data Modeling Challenges - The Issues Data Engineers & Architects Face When Implementing Data Models 

Seattle Data Guy
96K subscribers
25K views

Data modeling is an important skill for data engineers.
But you will face plenty of challenges once you dive into the logic and business processes that produce that data.
The goal of data modeling is to represent the business, its transactions and so forth, so that others can use the data.
But it's far harder than just taking a star schema and slapping it on top of your data.
If you enjoyed this video, check out some of my other top videos.
Top Courses To Become A Data Engineer In 2022
• Top Courses To Become ...
How And Why Data Engineers Need To Care About Data Quality Now - And How To Implement It
• How And Why Data Engin...
If you would like to learn more about data engineering, then check out Google's GCP certificate
bit.ly/3NQVn7V
If you'd like to read up on my updates about the data field, then you can sign up for our newsletter here.
seattledataguy.substack.com/
Or check out my blog
www.theseattledataguy.com/
And if you want to support the channel, then you can become a paid member of my newsletter
seattledataguy.substack.com/s...
Tags: Data engineering projects, Data engineer project ideas, data project sources, data analytics project sources, data project portfolio
About me:
I have spent my career focused on all forms of data. I have focused on developing algorithms to detect fraud, reduce patient readmission and redesign insurance provider policy to help reduce the overall cost of healthcare. I have also helped develop analytics for marketing and IT operations in order to optimize limited resources such as employees and budget. I privately consult on data science and engineering problems both solo as well as with a company called Acheron Analytics. I have experience both working hands-on with technical problems as well as helping leadership teams develop strategies to maximize their data.
*I do participate in affiliate programs. If a link has an "*" by it, then I may receive a small portion of the proceeds at no extra cost to you.

Entertainment

Published: 29 Jul 2024

Comments: 34
@SeattleDataGuy · 1 year ago
If you guys want to learn more about data engineering, then sign up for my newsletter here seattledataguy.substack.com/ or join the discord here discord.gg/2yRJq7Eg3k
@RodrigoBocanegraCruz · 1 year ago
Different data models were defined to respond to different scenarios:
- 3NF for transactional applications
- Dimensional (star or snowflake schemas) or flat for reporting
- Data Vault, Anchor, Hook for integration

Sometimes you could even decide not to model the data at all, as many practitioners are doing, which could work for specific scenarios. Now we also have hybrid tables, so models are also being challenged for those previous use cases. Perhaps Anchor with its 6NF could be an interesting approach when having hybrid tables? It sounds appealing to me!

But regardless of the technology challenges and innovations, when we consider the overall enterprise and overarching data needs, I would say that the most important models for any scenario are the enterprise and conceptual models (not technical models), where you define the key business elements (data domains) and the relationships relevant for the business. After that, I would even suggest you can use whatever you want on every layer, as long as the physical model is aligned to the business model.

I think that's why Data Vault with automation in place is very popular in many industries: it forces you to understand the conceptual model and focus on the business rather than the technical implementation. Certainly Data Vault without automation is just a pain in the integration, and it certainly doesn't consider lakehouse architectures, where you could physically design your lake folders around data domains and define guidelines for relationships (the devil is in the details).

But sadly, as you suggest, people usually think conceptual models are easy to create or unnecessary, and that's a wrong step towards proper scalability. Even if you follow a federated approach (data mesh or equivalent), you'll eventually have to integrate data from multiple domains, and only enterprise and conceptual modeling can soften the burden.

This is where business and data architecture play a crucial role in defining proper data domains and linking that with data development, delivery, management, and governance. Thanks for the video!
@William-B · 1 year ago
Could you do a demonstration video? Here's what we want to model -> Here's one way to model it
@chrisformoso · 11 months ago
this and the logic of microdecisions that goes into building the model would be gold!
@EcZachly_ · 1 year ago
That data modeling live will be so good!
@JoeG2324 · 1 year ago
My team certainly doesn't follow data modeling standards and it works perfectly fine for us. One of the main reasons we don't is that we simply don't have enough time or people to properly put together a data model. We handle data ingestion, transformations and reporting for many, many departments, and it would be impractical to build data models. What we do instead is have processes that build tables for different subject areas. So it's technically one table that fits a subject area, and these tables get joined with other datasets, etc. I'm simplifying this, but in general this is how we work. If my team covered only one subject area and had specific questions that needed to be reported on, then yeah, a data model might work in our case.
@SeattleDataGuy · 1 year ago
Yeah, I think a lot of teams take it on in different ways, one of which is to have looser practices. There are always tradeoffs, of course.
@moona5454 · 7 months ago
Basically, a gold layer per domain, and then dbt to create OBTs (one big tables) joining multiple domains.
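To make the subject-area-tables-plus-OBT pattern from this thread concrete, here is a minimal sketch. It uses Python's built-in sqlite3 rather than dbt, and every table and column name (sales_orders, crm_customers, obt_orders) is a hypothetical stand-in, not something from the video or the comments.

```python
import sqlite3


def build_obt(conn: sqlite3.Connection) -> list:
    """Join two hypothetical subject-area tables into one wide table (OBT)."""
    conn.executescript("""
        -- One table per subject area (names are illustrative assumptions).
        CREATE TABLE sales_orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        CREATE TABLE crm_customers (customer_id INTEGER PRIMARY KEY, segment TEXT);
        INSERT INTO sales_orders VALUES (1, 10, 99.0), (2, 11, 25.5), (3, 12, 40.0);
        INSERT INTO crm_customers VALUES (10, 'enterprise'), (11, 'smb');

        -- The OBT: a LEFT JOIN keeps every order even when the customer is unknown.
        CREATE TABLE obt_orders AS
        SELECT o.order_id, o.customer_id, o.amount, c.segment
        FROM sales_orders AS o
        LEFT JOIN crm_customers AS c ON c.customer_id = o.customer_id;
    """)
    query = "SELECT order_id, customer_id, amount, segment FROM obt_orders ORDER BY order_id"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
rows = build_obt(conn)
for row in rows:
    print(row)
```

The LEFT JOIN is a deliberate choice in this sketch: order 3 has no matching customer, so it stays in the wide table with an empty segment instead of silently disappearing from reports.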
@TaeLMG · 1 year ago
@SeattleDataGuy Hello, I had a question. I just finished my BS degree in Software Development, and I was wondering whether that is enough, as far as the degree aspect goes, to get into data engineering. When I search around for jobs, I see a lot of Data Engineer positions require a Master's degree, which I do not have.
@user-xi4cz6qz1i · 7 months ago
What do you think about a graph data model?
@BJTangerine · 1 year ago
Congrats on the new home
@SeattleDataGuy · 1 year ago
Thank you!!!
@hwy9nightkid · 3 months ago
What is the best open source data modelling option?
@Kondaranjith3 · 1 year ago
THANK YOU VERY USEFUL
@SeattleDataGuy · 1 year ago
Glad it helped
@thedailyepochs338 · 7 months ago
Loved this video. Thank you
@SeattleDataGuy · 5 months ago
glad you liked it!
@sarvesht7299 · 1 year ago
In brief, when the requirements are clear, what may go wrong while designing a data model? Like the challenges we might encounter.. could you please share those?
@jppbkm · 1 year ago
The requirements will turn out to be wrong 😂
@cnaeuspompeus3188 · 1 year ago
Here is one: a perfect model with 3 fact tables and 45 dimensions(!). The model was almost perfect... cool, yeah! However, 25%, 33% and 45% of the FKs for the 3 fact tables were zeros or completely missing (without enforced constraints, of course), due to a “successful” migration years ago of an old DB to the OLTP DB that this star schema was reading from. The modeling work was done, the PMs were happy, and then it was time for me to come in and do “reporting”... the biggest joke of my career.
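A check like the one below can surface this kind of problem before the reporting stage. It is only a sketch under stated assumptions: it uses Python's built-in sqlite3, the helper orphan_fk_rate and the tables fact_sales and dim_customer are hypothetical, and it treats a key as unusable when it is NULL, zero, or has no matching dimension row.

```python
import sqlite3


def orphan_fk_rate(conn, fact, fk, dim, pk):
    """Share of rows in `fact` whose `fk` is NULL, 0, or matches no row in `dim`."""
    # Table and column names are interpolated directly, so pass trusted identifiers only.
    sql = f"""
        SELECT AVG(CASE WHEN f.{fk} IS NULL OR f.{fk} = 0 OR d.{pk} IS NULL
                        THEN 1.0 ELSE 0.0 END)
        FROM {fact} AS f
        LEFT JOIN {dim} AS d ON d.{pk} = f.{fk}
    """
    (rate,) = conn.execute(sql).fetchone()
    return rate if rate is not None else 0.0  # AVG over an empty table is NULL


# Hypothetical data: no foreign key constraint is declared, so bad keys load without error.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, customer_key INTEGER, amount REAL);
    INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO fact_sales VALUES (1, 1, 10.0), (2, 0, 20.0), (3, NULL, 30.0), (4, 99, 40.0);
""")
rate = orphan_fk_rate(conn, "fact_sales", "customer_key", "dim_customer", "customer_key")
print(f"{rate:.0%} of fact_sales rows have an unusable customer_key")
```

Running a check like this per fact table and per foreign key, before anyone builds reports on the schema, would give numbers comparable to the percentages quoted in the comment.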
@sarvesht7299 · 1 year ago
@cnaeuspompeus3188 Ha.. if it was a successful migration, then how were those tables empty? And how did you overcome this issue? Reloaded the tables?
@ivani3237 · 8 months ago
@sarvesht7299 It's a typical situation in many DBs: at some point you add a new property (ex. "') to your fact table, the old data isn't updated (it has NULL in that column), and new data comes in with it populated.
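The situation described in this reply can be reproduced and measured in a few lines. This is a sketch with Python's built-in sqlite3, and the table fact_orders and the added column channel are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
    INSERT INTO fact_orders VALUES (1, '2023-01-05', 10.0), (2, '2023-02-11', 20.0);

    -- A new property is added later; existing rows are not backfilled and stay NULL.
    ALTER TABLE fact_orders ADD COLUMN channel TEXT;
    INSERT INTO fact_orders VALUES (3, '2023-03-02', 30.0, 'web'), (4, '2023-03-09', 15.0, 'store');
""")


def null_share_by_month(conn):
    """Per month: how many rows have a NULL `channel`, and how many rows exist in total."""
    sql = """
        SELECT substr(order_date, 1, 7) AS month,
               SUM(channel IS NULL) AS null_rows,
               COUNT(*) AS total_rows
        FROM fact_orders
        GROUP BY substr(order_date, 1, 7)
        ORDER BY month
    """
    return conn.execute(sql).fetchall()


profile = null_share_by_month(conn)
for month, null_rows, total_rows in profile:
    print(month, null_rows, total_rows)
```

Profiling NULLs by load or event date, rather than for the table as a whole, shows when a column started being populated, which helps decide between backfilling old rows and documenting the cutoff.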
@ivani3237 · 8 months ago
The requirements are constantly changing over time, plus new data keeps coming in.
@mirlamontano6640 · 8 months ago
awesome video!
@SeattleDataGuy · 8 months ago
Thank you!
@datamandy8975 · 1 year ago
Why does the camera always wiggle between zooming in and zooming out? It's making my head burst... Your content is always good 🤗 Can you please fix just this one thing?
@SeattleDataGuy · 1 year ago
Ok, I can see if we can do that less!!
@calypso5629 · 1 year ago
There is some background noise.
@emonymph6911 · 9 months ago
congrats on the new house
@SeattleDataGuy · 5 months ago
thank you!
@santoshbhamidipati8455 · 9 months ago
Can you give a demo instead of explaining theoretically? Many of us really need help with a real-world example of data modelling for cracking interviews.