KNOW the difference between Data Base // Data Warehouse // Data Lake (Easy Explanation👌)

Chandoo

656K subscribers

508K views

Published: 28 Sep 2024

Comments: 510

@MinhNguyen-ih5dt 2 years ago

I have watched or read many explanations of the differences among these 3 terms, but so far this video is the simplest, clearest and easiest to understand. Thanks a lot!!!

@chandoo_ 2 years ago

Wow.. thank you for that 😀

@udaynarri967 2 years ago

Exactly, this is how I feel. Thanks Chandoo.

@theh1ve 2 years ago

I came to the comments to say the same thing! Thank you for this simple, illustrative explanation.

@hasanwasti7227 2 years ago

@chandoo_ qqq

@ayasrhan9751 2 years ago

I very much agree

@RAZREXE 2 years ago

There is no other video on youtube that explains DB/DW/DL this easy. Really appreciate the time and effort you put into making these videos.

@EternalEvanesce 2 years ago

Thanks Chandoo. The YouTube algo was brilliant today, suggesting this goldmine to me!

@sandeep4uin 1 year ago

Thanks Chandu for making these concepts so simple to understand. Whenever I get confused I just refer to your videos for quick and accurate understanding of the concepts.

@morris5984 2 years ago

Just found your channel. I’m sharing your videos with my team that is a bit behind on these concepts. Thanks!!

@mmiltenburg 1 year ago

Very nice for non data specialists. I was searching for basic explanation and that's what you gave me!

@machinimaaquinix3178 1 year ago

Short, sweet and right on point to help quick learning, you got a new sub!

@zwelimjanepatric1584 2 years ago

I have always struggled to understand what a data warehouse is, but this video made it so simple to understand. Thank you.

@asadullahmalik1503 8 months ago

Excellent video, with great and user friendly explanation. Loved it

@bmcseal01 2 years ago

Wow, I didn't think I'd learn anything, but I learned some more about OLAP (DW) vs OLTP (DB).

@Ravi-Krishna 2 years ago

Your teaching style is simple and superb, thank you.

@mayank.kr.30 2 months ago

Great explanation and very easy to understand example.

@francksgenlecroyant 1 year ago

Perfect explanation. I immediately subscribed 👊👊👊

@abhishekprelog 2 years ago

This is the first video I saw on your channel and it made me instantly subscribe. Brilliant explanation.

@chandoo_ 2 years ago

Thank you and welcome aboard Abhishek.

@derrickmakhoba5279 2 years ago

I have learned a lot from you and thank you very much for clearly explaining 🙂

@chandoo_ 2 years ago

You are welcome Derrick. 😀

@naazlyhameed8468 2 years ago

Awesome.. understood data lake for the first time

@chandoo_ 2 years ago

Glad it helped

@surfh3r0 1 year ago

Love the notepad presentation! Nice explanation

@dankchan420 1 year ago

this is great. thanks for the clear explanations

@m_subir 2 years ago

Very simple yet effective articulation!!

@NiKO...... 1 year ago

Hello Chandoo thanks for explaining things so clean for us!!!

@lilig9239 2 years ago

The best explanation that I heard. Thanks!

@dominiquez5643 11 months ago

Thank you so much for the time put in your videos! extremely helpful!

@navinrangar2626 1 year ago

so clean thanks man!

@veebee3969 1 month ago

Thank you so much. Clear video.

@yidu8496 1 year ago

Very clear explanation! Thank you so much!!!

@Kushcherry_Jollyjuana 1 year ago

Why SQL for database and not also Power BI and Excel? Also that punchline at end 😂 This was fun and informative. 🔥

@zainabhabeeb3157 2 years ago

The best explanation 👍🏻👍🏻great job

@chandoo_ 2 years ago

Glad it was helpful!

@tacorevenge87 2 years ago

A data lake can also be done on-prem, and the data lake can do what a DB and a DWH do.

@user_1abc 2 years ago

Yeah. His explanations were not very inclusive. Supposedly meant for data newbies.

@christianstreetball 1 year ago

My favorite data YouTuber... 👓💡😂

@khaoulamrhar6731 1 year ago

Very clear and simple

@pushpalathak5499 1 year ago

Very well put! Simple and easy explanation.

@abhaysinghbhosale2748 1 year ago

You are simply an expert

@gauravmathur56 1 year ago

Thank you Chanduu bhaiya ✌🏻✌🏻

@francesdobbins2964 4 months ago

This is an excellent video.

@ecwb 2 years ago

Great video many thanks for your time 👏

@MC-8 2 years ago

Very clear and to the point! Can you do a review of GCP vs. Snowflake?

@lebricoleurdudimanche34 2 years ago

Many Thanks for the video 👍

@mdafroze6123 2 years ago

it was smooooooooth...... THANKYOU😇

@vrcks8066 2 years ago

Wow the way u explained so interesting and inspiring ❤💯

@chandoo_ 2 years ago

Thanks VN 😀

@SouhaHMISSI1991 1 year ago

Very clear and simple explanation, thank you :) Just one point: BigQuery is not a data lake, it is a data warehouse. I thought data lakes are called so when the architecture behind them is based on Hadoop. What do you think?

@wejdah 6 months ago

I swear to God, legendary

@ruoxima3773 2 years ago

Very clear! Good explanation!

@kingmaker-ky2th 1 year ago

Please do videos on Azure Databricks and Synapse Analytics.

@baderalsahli3619 2 years ago

my best mentor ever

@nayeem3905 2 years ago

Hey, Chandoo. Great explanation. Do you have any recommendations where to learn more advanced topics regarding data warehouse? Thank you

@Soulenergy31 2 years ago

This video is Gold Chandoo, thanks a bunch!!!!!!!!!!!!!!

@chandoo_ 2 years ago

Thank you Saul... You know me.. .when I find gold, I must share it.

@Soulenergy31 2 years ago

@chandoo_ you are a true Optimus Excel leader! 😀

@xaamirx 1 year ago

You are a genius

@vijayramaraju1990 2 years ago

your explanation is very clear and well understood, Thank you Sir

@chandoo_ 2 years ago

You are most welcome

@bezawitarage1686 2 years ago

YOU ARE THE BEST THANKZ SIR

@nikhilrao1266 2 years ago

This was awesome

@nageshtalkin3932 1 year ago

Excellent

@mdmd5774 2 years ago

Loved this one, keep these coming

@himalayasaikia5762 2 years ago

Very well explained, thanks for this video. I must say you look like Rajesh Koothrappali though

@chandoo_ 2 years ago

Make way for the fastest man on earth....

@wayneedmondson1065 2 years ago

Thanks Chandoo. Very interesting and informative! I always learn something new at your channel :)) Thumbs up!!

@chandoo_ 2 years ago

You are welcome Wayne.

@SujathaRavikanth 2 years ago

Mr chandu I didn't understand about lake...the rest two 👍

@khaoulaayaou4611 2 years ago

thank you this was very useful

@atmavidyavirananda6353 2 years ago

Amazing, concise explanation. Subscribed.

@chandoo_ 2 years ago

Welcome aboard Atmavidya... 😀

@Pifagorass 2 years ago

Good stuff, what's your take on BAM rediscovered with Activity Schema as Time Series and LTE (mostly materialised views)? Do you see such as something in the middle between databases and data warehouses for analytical workloads or just another modelling approach.

@cavenmasetla8740 7 months ago

I have been looking for an answer and I can't get it. Where do Data Warehouse Developers start? What's the roadmap?

@SACHINKUMAR-px8kq 1 year ago

Thankyou so much Sir

@zvijer2960 2 years ago

So, to check my sanity, if you only could pick one - DW or DL for storing very precise financial data, which would you pick?

@chandoo_ 2 years ago

You could keep it in a database. If you end up doing analysis or asking questions where the structure of your stored data is not working for you, then you can reshape it and store it (and call it a warehouse).
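
A minimal sketch of the "reshape it and store it" step in the reply above, using Python's built-in `sqlite3` module. The table names (`transactions`, `monthly_summary`), the columns and the sample rows are hypothetical, chosen only for illustration; a real warehouse would likely live in a separate system.

```python
import sqlite3

# Hypothetical operational table holding precise financial records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (day TEXT, account TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("2024-01-05", "A", 100.0),
        ("2024-01-20", "A", -40.0),
        ("2024-02-03", "A", 25.0),
    ],
)

# Reshape the same data into a structure that suits analytical questions
# and store the result as its own table.
conn.execute(
    """
    CREATE TABLE monthly_summary AS
    SELECT substr(day, 1, 7) AS month, account, SUM(amount) AS net_amount
    FROM transactions
    GROUP BY substr(day, 1, 7), account
    """
)

summary = conn.execute(
    "SELECT month, account, net_amount FROM monthly_summary ORDER BY month"
).fetchall()
print(summary)
```

The original rows stay untouched in `transactions`; the summary table is only a derived, analysis-friendly copy.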

@akiikikasaija9321 1 year ago

Nice, thanks

@RggEverest 2 years ago

Nice

@behrad9712 1 year ago

Thank you so much 🙏

@TheyCalledMeT 2 years ago

You cannot imagine how often I have talked with high management and totally disillusioned them by explaining what a data lake is. It's just the next buzzword, not THE solution to all our problems .. sure, it's useful for its specific use case .. but that's it ^^

@sujitpanda1985 1 year ago

Best🤗🤗🤗🤗

@abhideb1986 2 years ago

Hats off to this explanation wow

@chandoo_ 2 years ago

Thanks Abhi... 😀

@naveedahmad1564 1 year ago

Wow😍

@kranthimo1 2 years ago

Thank you for great explanation

@tarakarambabuk5348 2 years ago

Thank you.

@aki3774 2 years ago

I don't understand the point why historic information should be put in a different system (the data warehouse). If you wish to delete a product (in this case, a chocolate) from your active product line, why do you even need to delete the item from the db? You could just keep the product and maintain the info about the active portfolio in the attributes or set up a different table for discontinued/active products. Having a separate system seems like an overly complex way to maintain this information. Can someone explain?
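
The alternative proposed in this comment, keeping discontinued products in the operational database and marking them inactive instead of deleting them, could be sketched roughly as follows. This is only an illustration with Python's built-in `sqlite3` module; the `products` table, the `is_active` column and the product names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        is_active INTEGER NOT NULL DEFAULT 1  -- 1 = sold today, 0 = discontinued
    )
    """
)
conn.executemany(
    "INSERT INTO products (name) VALUES (?)",
    [("Dark Bar",), ("Mint Bar",)],
)

# "Remove" a product from the active line without deleting its row.
conn.execute("UPDATE products SET is_active = 0 WHERE name = ?", ("Mint Bar",))

# Day-to-day queries must filter on the flag...
active = [row[0] for row in conn.execute(
    "SELECT name FROM products WHERE is_active = 1 ORDER BY name"
)]
# ...while historical questions can still see every product.
everything = [row[0] for row in conn.execute(
    "SELECT name FROM products ORDER BY name"
)]
print(active, everything)
```

One trade-off of this approach is that every query against the live table has to remember the `is_active` filter.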

@gmotdot 2 years ago

What about Lakehouse architectures?

@AK-ok2jh 2 years ago

Well explained 👏👏

@pribhi98_ 1 year ago

thank you!

@monishnagarajan5917 2 years ago

informative video

@MrAlgorhythm 1 year ago

Thank you

@AnkitGupta-ic6ni 2 years ago

Very nice explanation. Can a data lake also serve as a data warehouse, so that we can eliminate the DW?

@chandoo_ 2 years ago

It *_can_* provided the Data Lake platform supports warehouse style queries. Some of them do, but not all.

@matejsopor5834 2 years ago

amazing thanks

@chandoo_ 2 years ago

Most welcome

@sanjubell 2 years ago

Crisp and clear... any layman can understand.

@mouligodavarthi-yme1701 2 years ago

lots of love from telugu guy....

@chandoo_ 2 years ago

Thanks Mouli garu 😍

@ПавелСтафеев-м4р 1 year ago

Someday you will definitely create your own company called Awesome Chocolates)

@PrasadNadig26 2 years ago

AWS and Azure are not data lakes; they are cloud platforms. S3 and Blob Storage are examples of data lakes on these platforms. On GCP the example is Cloud Storage and NOT BigQuery; BQ is not a data lake.

@chandoo_ 2 years ago

Thanks for the clarification.

@milliekim5072 2 years ago

Thank you!

@chandoo_ 2 years ago

You're welcome!

@vishalmokashi3163 2 years ago

Which is the better ETL tool, Informatica or Power BI?

@chandoo_ 2 years ago

There is no easy answer to this kind of question. ETL choices depend on existing systems, DWH architecture, technical competencies and personal preferences. If you are learning, then I say learn Power BI first as it has a wider implication. All the best.

@vishalmokashi3163 2 years ago

Thank you. Actually, I joined an Oracle + Informatica course 3 months ago.

@Dada-gk9ic 2 years ago

So basically, a single business, multiplied into 3 because why not right? And oh make it monthly because they can?

@vipulsingh2846 2 years ago

Then what are MySQL and MongoDB?

@kk-pi4hj 2 years ago

Ok now explain what Data Cake is?

@DemetriPanici 2 years ago

This video did a great job of helping me learn the distinction between these 3 things. Love it!

@chandoo_ 2 years ago

Thank you Demetri... 😍

@sarago99 2 years ago

Simple to start with. No PPT slides, just notepad is enough to explain ❤️ Thank you bro. Keep up your good work 👍

@amirmalekahmadi9910 2 years ago

Wise men can explain sophisticated things in a way that a 5-year kid can easily learn! Congrats Wise Man!

@chandoo_ 2 years ago

😍 That is a beautiful compliment. Thanks Amir.

@abdulrahmanbinillyas5944 4 months ago

I have seen many videos but this explanation is very nice and clear

@ravinaikwadi9899 2 years ago

Little correction: a data warehouse is a system and/or DB where hundreds of heterogeneous DBs (e.g. chocolate DB, biscuits DB, candy and ice cream DBs) or file-based sources such as Excel and XML are modelled, stored or streamed together using an ETL tool, for data analytics and downstream applications, and also for data science and AI builds.
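
A rough sketch of the ETL idea in this comment: rows are extracted from two different kinds of sources, transformed into one common shape and loaded into a single warehouse table. It uses only Python's standard library (`sqlite3`, `xml.etree.ElementTree`); the source data, the `fact_sales` table and all names are hypothetical, and a real pipeline would normally use a dedicated ETL tool.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Source 1: an operational database (hypothetical chocolate sales).
chocolate_db = sqlite3.connect(":memory:")
chocolate_db.execute("CREATE TABLE sales (product TEXT, amount REAL)")
chocolate_db.execute(
    "INSERT INTO sales VALUES ('Dark Bar', 10.0), ('Mint Bar', 4.5)"
)

# Source 2: a file-based export, here an XML string (hypothetical biscuit sales).
biscuit_xml = "<sales><sale product='Oat Biscuit' amount='3.0'/></sales>"

# Extract from each source and transform into (source, product, amount) rows.
rows = [
    ("chocolate", product, amount)
    for product, amount in chocolate_db.execute("SELECT product, amount FROM sales")
]
rows += [
    ("biscuit", sale.get("product"), float(sale.get("amount")))
    for sale in ET.fromstring(biscuit_xml).iter("sale")
]

# Load everything into one warehouse table and query across all sources.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE fact_sales (source TEXT, product TEXT, amount REAL)")
warehouse.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)

by_source = warehouse.execute(
    "SELECT source, SUM(amount) FROM fact_sales GROUP BY source ORDER BY source"
).fetchall()
print(by_source)
```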

@ravinaikwadi9899 2 years ago

@ChrisSmithFW Yeah, but he forgot to mention so.

@Morgue12free 1 year ago

I believe that's what he said. His explanation is just a lot more understandable than yours.

@Jishnu_OnTheRocks 11 months ago

Your answer sounds like quoted from an NCERT textbook and his is more like a next door tuition teacher

@Lividbuffalo 8 months ago

Wtf

@kameshk6188 2 years ago

I don't think any other video on the internet explains this difference as clearly as this video. Thank you brother. Keep posting more videos to educate us.

@patrickschardt7724 2 years ago

I think because of your clear and concise points and humor, I learn more from you than other Excel tutorial channels. Keep up the great work.

@chandoo_ 2 years ago

Aww.. that means a lot Patrick :)

@paulrprichard 2 years ago

In a typical database there will be transactions taking place, like the insert, update or read of a table row, that are in line with a set of business cases. In a data warehouse there will be analysis taking place across multiple rows from multiple tables. A data lake is where data goes to get drowned.
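
To make the contrast in this comment concrete, here is a small sketch with Python's built-in `sqlite3` module. Both workloads run against one in-memory database purely for brevity; the `customers` and `orders` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)

# Transactional (database-style) work: insert, update and read single rows.
conn.execute("INSERT INTO customers VALUES (1, 'North'), (2, 'South')")
conn.execute("INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 20.0)")
conn.execute("UPDATE orders SET amount = 12.0 WHERE id = 1")
one_order = conn.execute("SELECT amount FROM orders WHERE id = 1").fetchone()

# Analytical (warehouse-style) work: aggregate many rows across joined tables.
totals = conn.execute(
    """
    SELECT c.region, SUM(o.amount)
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
    """
).fetchall()
print(one_order, totals)
```

The first group of statements touches one row at a time, while the final query scans and combines rows from both tables.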

@chandoo_ 2 years ago

"A data lake is where data goes to get drowned." 😂😂😂

@rajivjani8594 1 year ago

Super! In just 8 minutes, you have put such a clear picture of data base, data warehouse and data lake, that I can never forget and in future, any time I deal with these terminology, I have crystal clear idea of what am I dealing with! You are a GREAT teacher Chandoo and I really appreciate your effort!

@aiasaiascon3894 2 years ago

One more comment for me. The best, most simple, laconic, yet rich, explanation about the diffs of the terms.

@edgarmartinez2710 2 years ago

I had no idea about data warehouse or data lakes. Thanks Chandoo for sharing your knowledge and the great breakdown of each.

@PVivekmca 6 months ago

You are a treasure on YouTube

@eshwarsai5027 1 year ago

One of the finest explanations. 👍 Loved it ❤️

@chandoo_ 1 year ago

Glad you liked it!

@rbogomil 2 years ago

Great job, clear explanation and I also enjoy your humor. Would be great if you could create a video describing the difference between data scientist, engineer, analyst and architect. Kudos on your excellent work!

@ryanshannon6963 1 year ago

If you're starting in I.T. doing analysis type work, you'll start as an Analyst. This can be anything from reporting, automated feed maintenance/RCA, and even development. Most of the above 3 (maybe save for Data Scientist) start here. Data Engineer is probably the most logical next step from analyst. You'll definitely be doing more development and analytical work as an analyst prior to this. This shifts your scope from retrieving data from a data warehouse/db/lake (lake is quite rare for a run of the mill analyst), to actually designing and some possible light architecting of table/schema structures for data to import into from other sources (typically starting as transactional information into a database from an app, or maybe an external source of some sort). Typically as an engineer you won't start on data warehouse modelling until you've had some experience with general transactional architecting/engineering since the data within a warehouse shouldn't be updated/deleted, only inserted. It will be deleted, possibly if you've archived it in some situation (like data that's over x-years old and based on specific policies), but even then it probably wouldn't be deleted. If the architecture allows, you may just duplicate the tables, or partition them in some way and then archive the older pages. They may also determine certain structural recommendations (rowstore vs columnstore table structures, for example, or using NoSQL vs relational databases), but usually it's in concert with an Architect if the process being designed is large enough, or has significant impact, especially in terms of performance. However, after discussions between Engineers and Architects, the Engineers (and to a lesser extent, Analysts) will IMPLEMENT the requisites of decided Architecture. 
Engineers are typically more hands on than Architects, but Archs may get their hands dirty if something is largely conceptual and they want to start plugging away earlier in the phase to ensure design solidity. Data Architect is anything from designing the schema for your transactional infrastructure (your primary database), data warehouse, or even data lake, as well as helping navigate and determine how to import data into those repositories, as well as even more expansive things such as CI/CD pipelines, *maybe* networking tasks if you're familiar enough with that (usually system administrators do that, though), or even helping implement connection string/authentication against your cloud resource targets originating from nearly any source caller (on premises machine, like a developer computer, a VM hosting an app service, CI/CD agent, or a completely separate cloud service not native to your cloud service, even on a completely different domain or client server). An Architect is going to be responsible for HOW disparate system objects are going to interact with each other and any potential issues given certain implementations or design sequences. Typically Architects are going to have some knowledge as to what different approaches are available and determine which makes sense given what's required for the need or problem that needs resolution. As an Architect you're not expected to know how to implement everything as if you were doing all the work yourself. However, having a basic understanding of the limitations of each element in the design will definitely help you determine which is possible and which may not be earlier in design phase, which helps mitigate wasted developer time later during spikes (Proof of Concept phases) and help with further engineering alignment tasks. 
Most people consider scientists the babies in the room because the data they require should be perfect, in the sense of not needing any changes to its representation outside of what the algorithmic modelling requires. It's entirely possible a Scientist will ask the Engineer to modify schema and data to accommodate some sort of analysis or data modelling they're trying to complete. It's not atypical for an Engineer to work closely with a Scientist, but not typical for the Scientist to work with the Architect, aside from the initial standing up of a new Data Warehouse or Data Lake. Typically the Engineer maintains or may make the everyday changes to those structures once the inputs/outputs/transformational processes have already been established. Scientists are typically Statisticians or anything having to do with applied mathematics. They will also typically work with code that isn't strictly SQL, such as Python, R, Power BI, DAX (maybe MDX, but I think that's fallen largely by the wayside), etc. Scientists are tasked with supplying the answers to complex problems for the business using quantitative analysis. These are the people that determine what Ads you may see given your previous and most recent search history. Something you searched for 3 years ago may not be as relevant as something you searched for yesterday. That would be a typical example of what a Scientist may do. Also, Google Translate and things like that will be developed by the Scientist, but the Architect will design the bridges to source that data, whereas the Engineer will make that design a reality. The Analyst will make sure data makes sense as it starts trickling through the design process, and if there are any issues, the Analyst, maybe working with the Engineer, will troubleshoot the why/how and determine a fix, where either of them may implement that fix to ensure it works as intended. 
If you look at it as a decision tree, it may look something like:
Analyst > Engineer > Architect
Analyst > Engineer > Scientist
Analyst > Scientist (again, typically short cut by a Masters in Statistics or similar)
Hope that helps!