Data Vault vs Traditional Data Warehouse Architectures 

nullQueries

Published: 30 Sep 2024

Comments: 39
@nullQueries 3 years ago
What do you think of the data vault compared to the dimensional data warehouse? Have you built both? For more data warehouse options: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-Tff34jj_V-0.html
@JimRohn-u8c 2 years ago
I would love to see more videos on how to implement this. Wish there was a Udemy course on how to implement this.
@norpriest521 2 years ago
@JimRohn-u8c I love how he mentioned at the end that data vault may not be the best option for some scenarios. It shows that the question isn't which one is better, but which one is more reasonable to use in a specific scenario.
@willi1978 1 year ago
The idea of Data Vault sounds nice. But with an ETL automation tool like WhereScape, ETLs can be adapted very nicely too, and with less overhead.
@AlbertoSimeoni-wi9wj 9 months ago
I think the main problem is the computational power and time needed to build all the link tables. The fact that in the end you build a reporting layer that is in fact a dimensional model cancels out all that effort. The clear advantage is having the original keys in a staging area and not having to change the extractors. But all of this was designed with old row- and disk-based databases in mind. With an in-memory columnstore database (SAP HANA) the link logic is not necessary; it can all be virtual. We have customers whose entire DWH/BI logic runs on the ERP database, with tables of over 100 million rows, all with virtual modeling and no persistence.
@stephanzhechev141 1 year ago
This is a wonderful video. Unfortunately for me, I read 450 pages of Dan Lindstedt's book introducing the Data Vault 2.0 architecture. This is, hands down, the worst book I have ever read. It is just horrible. However, it does contain about 7 good ideas, and this video captures all of them in a nicely presented, coherent way. Thank you!
@CrazySw3de 3 years ago
I enjoy your videos quite a bit, just a few pieces of constructive criticism: I feel like a little more space between sentences, to let the viewer digest what is being said and shown, would help a lot. I like the clean look of the visuals, but text labels etc. help make things easier to process visually. I think the visual example you did with the tables in this one was good; more real examples like that, showing what these concepts actually look like in the real world, even just as examples, help drive the points home. Looking forward to seeing your channel grow, keep up the good work!
@nullQueries 3 years ago
Thanks for the feedback. I'm trying to keep these as 5 minute overview videos, which is a challenge with some of these dense topics. Still trying to work out the pacing and how much detail to cram in. I have some ideas for more in depth, slower paced example videos to go along with the overviews. Just need to find the time!
@sued12345 6 months ago
@nullQueries For me, you don't need to change anything. I mean, a short video will not replace proper training, but it helps a lot. Thank you for your effort.
@danielolaru2496 2 years ago
I went from the 3NF video to the dimensions one to this one, and the only approach whose advantage I can see is the dimensional/Kimball one. This data vault just seems like overkill. The storage will increase exponentially with all the extra keys needed, and with very large tables of millions or billions of rows I suppose performance will be greatly impacted when querying all those keys. Why is this an easier ETL solution? Am I missing something?
@TheR0yalBeast 2 years ago
Hi Daniel, I think a key point to understand about the data vault is that it is exceptionally good at showing lineage. From my point of view it is only a good solution when you are dealing with many different data sources that need to be combined. A great example is a project I helped on, combining 10 different SAP clients at a manufacturing company. Each is customized slightly: the data may be stored in the same fact table, say sales, but have different indicators or flags etc. modifying it. With the ETL solution you would do a one-off ETL to land it in a standardized table; however, in 4 years you will need to spend weeks of development trying to figure out where the mistakes are and what transformation occurred.
@SamuelLees-jv8ji 1 year ago
I see a lot of advantages with data vault but I just can't see it as an advantage over dimensional warehouse for my business context: e-commerce platform + CRM + billing system + marketing campaign system because all of these sources are quite static. Would be great to get feedback on this.
@husanturdiev 1 month ago
Hi Daniel! Actually, storage cost with Data Vault is on average a lot less than with dimensional data modeling. I would suggest two main factors for moving to Data Vault: 1. an enormous amount of data, 2. complexity of data and business processes. When building a Data Vault, you make a data model that is change tolerant, i.e. if something changes in the business or in its processes, the data model stays as it is, which is not the case for dimensional data modeling. Data Vault is extremely hard and expensive to create but cheap to maintain; a dimensional data model is easy and cheap to create but expensive to maintain in the long run. That is why there are hybrids, data vault + dimensional modeling, where you first model the data in vaults and then model dimensions and facts on top of the data vault.
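The hybrid approach described in this comment, with dimensions modeled on top of the vault, can be sketched roughly as below. This is only an illustration using Python's built-in `sqlite3`; the table names (`hub_customer`, `sat_customer`), the view name `dim_customer`, and the sample rows are all assumptions made up for the example, not anything shown in the video.

```python
import sqlite3

# In-memory database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Vault layer: a hub holds the business key, a satellite holds its history.
CREATE TABLE hub_customer (customer_hk TEXT PRIMARY KEY, customer_id TEXT);
CREATE TABLE sat_customer (customer_hk TEXT, load_ts TEXT, name TEXT, city TEXT);

INSERT INTO hub_customer VALUES ('hk1', 'C-1'), ('hk2', 'C-2');
INSERT INTO sat_customer VALUES
    ('hk1', '2024-01-01', 'Ann', 'Oslo'),
    ('hk1', '2024-06-01', 'Ann', 'Bergen'),
    ('hk2', '2024-03-01', 'Bob', 'Turku');

-- Dimensional layer: a view that exposes the latest satellite row per hub key.
CREATE VIEW dim_customer AS
SELECT h.customer_id, s.name, s.city
FROM hub_customer AS h
JOIN sat_customer AS s ON s.customer_hk = h.customer_hk
WHERE s.load_ts = (
    SELECT MAX(s2.load_ts)
    FROM sat_customer AS s2
    WHERE s2.customer_hk = h.customer_hk
);
""")

dim_rows = conn.execute(
    "SELECT customer_id, name, city FROM dim_customer ORDER BY customer_id"
).fetchall()
for row in dim_rows:
    print(row)
```

Because the dimension here is a view, the vault tables remain the only persisted copy of the data; whether a view or a materialized table is the better choice depends on data volume and the database in use.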
@christopherbronson3275 3 years ago
Can I just say "Dimensional Datamart" is my favorite cyberpunk term
@MrCutlash 2 years ago
Data vault is the curated layer in a data lake. And it has a very specific design... But really it's an Inmon/operational design.
@guillaumegiroux9425 2 months ago
My company is moving from a data lake with a Raw and a Curated zone to a data lake/data vault with a Raw and a Certified zone. We are a huge bank with $9 billion in revenue. I feel it's a big gamble: the current system, while it has governance flaws, isn't that bad, and I wonder whether all the money will genuinely add value. What do you think?
@paulheadey265 10 months ago
My data engineering team have built many data vaults, but could never quite articulate to me as a business leader why. This has been very educational for me in explaining the benefits vs. the complexity. The pace at which business is changing and the number of new data sources becoming available make a data vault seem a more obvious choice. The business still gets its Inmon or Kimball model, but the foundational data structures in the vault provide more capability to make changes to them. That's what I inferred from this. I hope I am on the right track.
@srikanthmanduri6429 1 year ago
One of the best videos out there regarding Data Vault modelling
@michaelenriquez_ 3 years ago
Thanks for making these kinds of videos, I really appreciate it. They are so useful for people like me who are learning about this.
@pedropradocarvalho 2 years ago
Do you happen to have a transcript of this video? Maybe posted in a blog post?
@Sam-gj4hf 3 years ago
First time watching your videos and I absolutely love them! Subbed and liked. It'd be even more awesome if you could allow for an extra second to digest what you're saying. It's a lot of useful information. But even if you don't change anything, I'll still be a fan! Thank you for this!
@bytedonor 7 months ago
Well explained in pictorial format. But there should be some use case or an example so the newbies can understand more easily.
@pb78pb 3 years ago
Hi. Thank you for this overview video. Do you also have a webpage where you can be contacted? We would be happy to get your thoughts about DWH automation (we are the creators of the Datavault Builder tool). Regards
@ardee3949 3 years ago
Great videos, very informative. Can you do a quick comparison between Redshift and Vertica? An overall evaluation?
@yogeshbharadwaj6200 2 years ago
Very well explained... thanks a lot
@nullQueries 2 years ago
Glad it was helpful!
@SjeetjeMineetje 3 years ago
Very well explained with good examples, this is very helpful!
@treelo11 2 years ago
This video is very good, but I need to clarify the ETL process. Suppose I have a few raw files yet to be stored. They are placed inside the data lake unmodified. From there, I insert the data as hubs, link tables and satellite tables into the raw vault, creating surrogate keys along the way. Is that right? And what does "since objects in each layer never connect to each other" mean? 4:01
@ivani3237 2 years ago
It means there are no hard foreign keys, but logically they are of course connected.
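The loading flow asked about in this exchange (raw rows split into hubs, links, and satellites, with keys created along the way and no hard foreign keys) can be sketched roughly as follows. This is an illustration only: the structure names (`hub_customer`, `link_customer_order`, `sat_customer`), the `record_source` value, the sample rows, and the choice of MD5 hash keys are assumptions for the example. Hash keys are one common option in Data Vault 2.0; sequence-based surrogate keys would follow the same pattern.

```python
import hashlib


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more business keys."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def load_raw_vault(rows, record_source, load_ts):
    """Split flat source rows into hub, link, and satellite structures.

    The structures reference each other only by hash key values; nothing
    enforces a foreign key, so each one can be loaded independently.
    """
    hub_customer, hub_order, link_customer_order = {}, {}, {}
    sat_customer = {}  # keyed by (customer_hk, hash_diff) to skip unchanged rows
    meta = {"load_ts": load_ts, "record_source": record_source}

    for row in rows:
        customer_hk = hash_key(row["customer_id"])
        order_hk = hash_key(row["order_id"])
        link_hk = hash_key(row["customer_id"], row["order_id"])

        # Hubs: one row per distinct business key.
        hub_customer.setdefault(customer_hk, {"customer_id": row["customer_id"], **meta})
        hub_order.setdefault(order_hk, {"order_id": row["order_id"], **meta})

        # Link: one row per relationship between the two hubs.
        link_customer_order.setdefault(
            link_hk, {"customer_hk": customer_hk, "order_hk": order_hk, **meta}
        )

        # Satellite: descriptive attributes, stored as they arrived.
        hash_diff = hash_key(row["name"], row["city"])
        sat_customer.setdefault(
            (customer_hk, hash_diff),
            {"customer_hk": customer_hk, "name": row["name"], "city": row["city"], **meta},
        )

    return hub_customer, hub_order, link_customer_order, list(sat_customer.values())


# Hypothetical rows, as they might arrive from a raw file in the data lake.
source_rows = [
    {"customer_id": "C-1", "order_id": "O-10", "name": "Ann", "city": "Oslo"},
    {"customer_id": "C-1", "order_id": "O-11", "name": "Ann", "city": "Oslo"},
    {"customer_id": "C-2", "order_id": "O-12", "name": "Bob", "city": "Turku"},
]
hubs_c, hubs_o, links, sats = load_raw_vault(
    source_rows, "crm_export.csv", "2024-09-30T00:00:00Z"
)
print(len(hubs_c), len(hubs_o), len(links), len(sats))
```

Since every key is computed from the business key alone, the hub, link, and satellite loads do not need to look anything up in each other, which is one way to read "no hard foreign keys, but logically connected".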
@vidak92 7 months ago
Really, the best explanation.
@kabirsingh6582 2 years ago
Great content..subscribed!
@galeop 3 years ago
Really good video! Thank you! Quick question: what do you mean by "business logic"? Do you mean the kind of logic that would be used with an MDM to control whether new attributes about an entity should be added or ignored (e.g. if we have conflicting phone numbers for a customer)?
@nullQueries 3 years ago
I'm using "business logic" to mean any time some sort of business rule alters source data. Sometimes it's explicit (e.g. phone numbers are always stored in a certain format). And sometimes it's just tribal knowledge (e.g. some sources call it a customerID and some a consumerID, but everyone in the office knows it's referred to as ClientID, so we'll convert to that naming to make it easy for users to consume). A good MDM should handle this, but it depends on how it's implemented, what it catches, and where in the architecture it makes the changes. For the DV, though, this would happen in the business vault layer, as the raw vault should reflect the sources.
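The two kinds of rule mentioned in this reply (a fixed phone format and the customerID/consumerID to ClientID renaming) could look something like the sketch below when applied in a business vault step. The function names, the `KEY_ALIASES` mapping, and the digits-only phone format are assumptions chosen for illustration; the raw record is left untouched, in line with the raw vault reflecting the sources.

```python
import re

# Tribal-knowledge rule: different sources, one agreed name (assumed mapping).
KEY_ALIASES = {"customerID": "ClientID", "consumerID": "ClientID"}


def normalize_phone(raw_phone: str) -> str:
    """Explicit rule: keep digits only (an assumed target format)."""
    return re.sub(r"\D", "", raw_phone)


def apply_business_rules(raw_record: dict) -> dict:
    """Return a business-vault copy of a raw record; the input is not modified."""
    curated = {KEY_ALIASES.get(key, key): value for key, value in raw_record.items()}
    if "phone" in curated:
        curated["phone"] = normalize_phone(curated["phone"])
    return curated


raw = {"consumerID": "42", "phone": "(555) 010-1234"}
curated = apply_business_rules(raw)
print(raw)      # the raw vault row stays as the source delivered it
print(curated)  # the business vault row uses the agreed name and format
```

Keeping the rules in a separate step like this means a rule can change later without reloading the raw data.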
@galeop 3 years ago
Thank you!
@moverecursus1337 1 year ago
a little bit complex
@thghtfl 1 year ago
All those fancy pictures make zero sense without real live examples, just think about it
@mosa36 2 years ago
Nice video, where can we learn about the other data warehouse format?
@juliustuckayo8973 2 years ago
Great video. I stumbled upon this channel by accident today, after reading an opinion piece by Bill Inmon (on LinkedIn) on why Snowflake isn't a data warehouse. After watching your video on Inmon vs Kimball I immediately subscribed. Great content! What software do you use for the video animations? Anyway, you've got a new subscriber from Papua New Guinea. Keep it up, and happy Easter.
@nullQueries 2 years ago
Thanks for the compliment! I use the Adobe suite for all illustrations and animations.