
22. How to select Worker/Driver type in Databricks? 

CloudFitness
9K views

1. Usually, the driver can be much smaller than the worker nodes.
2. More cores per DBU means more parallelism per DBU, but each task works on smaller partitions because there is less memory per core.
3. More memory per core allows data to be split into bigger, and therefore fewer, partitions, which reduces shuffling.
4. Storage Optimized instances are especially relevant for queries with many joins or groupBy operations.
5. Choose Memory Optimized instances for less intensive ETL tasks.
6. Use Compute Optimized instances for more parallelism and for iterative tasks.
7. Do not choose HDD-backed clusters, as they are less performant.
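The tips above can be turned into a rough cluster spec. The sketch below is only an illustration, not the method from the video: the workload labels, the helper `build_cluster_spec`, the Azure VM sizes and the runtime label are all assumptions chosen for the example. The field names (`node_type_id`, `driver_node_type_id`, `num_workers`) follow the Databricks Clusters API as I understand it, so check them against the current API reference before use.

```python
import json

# Hypothetical workload labels mapped to an instance family, following tips 4-6.
FAMILY_FOR_WORKLOAD = {
    "many_joins_or_groupbys": "storage_optimized",  # tip 4
    "light_etl": "memory_optimized",                # tip 5
    "parallel_or_iterative": "compute_optimized",   # tip 6
}

# Illustrative Azure VM sizes per family (assumptions, not recommendations).
EXAMPLE_NODE_TYPES = {
    "storage_optimized": "Standard_L8s_v2",
    "memory_optimized": "Standard_E8s_v3",
    "compute_optimized": "Standard_F8s_v2",
}

# Tip 1: the driver can usually be smaller than the workers (assumed size).
SMALL_DRIVER_NODE_TYPE = "Standard_DS3_v2"


def build_cluster_spec(workload: str, num_workers: int) -> dict:
    """Return a cluster spec dict with a worker type picked by workload."""
    family = FAMILY_FOR_WORKLOAD[workload]
    return {
        "cluster_name": f"{workload}-cluster",
        "spark_version": "13.3.x-scala2.12",  # assumed runtime label
        "node_type_id": EXAMPLE_NODE_TYPES[family],
        "driver_node_type_id": SMALL_DRIVER_NODE_TYPE,
        "num_workers": num_workers,
    }


if __name__ == "__main__":
    print(json.dumps(build_cluster_spec("parallel_or_iterative", 4), indent=2))
```

Keeping the mapping in plain dictionaries makes it easy to swap in the instance types actually available in your workspace.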
Connect with me on LinkedIn
/ bhawna-bedi-540398102
Instagram
www.instagram....
Data-bricks hands on tutorials
• Databricks hands on tu...
Azure Event Hubs
• Azure Event Hubs
Azure Data Factory Interview Question
• Azure Data Factory Int...
SQL leet code Questions
• SQL Interview Question...
Azure Synapse tutorials
• Azure Synapse Analytic...
Azure Event Grid
• Event Grid
Azure Data factory CI-CD
• CI-CD in Azure Data Fa...
Azure Basics
• Azure Basics
Data Bricks interview questions
• DataBricks Interview Q...

Published: 11 Sep 2024

Comments: 18
@ayushsrivastava6494 3 months ago
Honestly, I've been seeing your tutorials from a long long time and nothing beats the way you teach! You legit are the best out there. 🙌🏼
@naveenkuppili2889 2 years ago
Love the way you explain. I'd appreciate it if you could put up videos on performance tuning (reading plans, DAGs, repartition vs coalesce, etc.). Thanks again.
@cloudfitness 2 years ago
Hi, videos on a few of those topics are already posted here: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-sIlayzDrP48.html ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-bKAneQeQe7s.html ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-a2ehHq3DJrw.html
@ajinkyadhoke4713 1 year ago
One of the finest video Bhawana....more power to you...🔥🔥🔥🔥
@Amin-tx4ku 1 year ago
Never came across such an elegant explanation.Thank you.
@ramaraju3273 2 years ago
Awesome 👍👍👍... very informative video as always... keep up your great work... thank you.
@kartikeshsaurkar4353 2 years ago
Very informative ! Thanks for providing the details with real time scenarios
@PavanKumar-tt8mm 2 years ago
I am exploring more knowledge from you on Databricks. I am waiting for another concept from you.
@tanushreenagar3116 8 months ago
Best explanation 👌 ever, it cleared 👏 many things. Thanks a lot.
@deepjyotimitra1340 2 years ago
Very useful information.
@shubhamaggarwal3676 2 years ago
It would be nice if you made a video on how to choose the optimal number of resources, such as shuffle partitions, repartitioning and worker nodes, and on how to optimise jobs by analysing the Spark UI.
@muthumulaprasanna 3 months ago
Great explanation. One question: how do you select memory, e.g. 14 GB or 24 GB?
@Akshay50826 1 year ago
A big thank you☺️
@AmitKumar-kh4ht 2 years ago
Very nice 👌
@Itachi_88mm 1 year ago
Delta cache also stores data from a remote location on the local node, so that data is read locally and network traffic is avoided. How is it different from a broadcast variable, then? Kindly help us understand this difference.
@MohammedKhan-np7dn 2 years ago
Hi Bhawana, thank you a lot for the knowledge-sharing session. Can you share a link to the PPT for our reference?
@pkrmettu28 1 year ago
Can you please teach power bi and python real time scenarios covering basics and interview?!
@nakkaeswaraoeswar2140 1 year ago
Thanks a lot beauty.....Can you explain about what is Azure services and Azure Sql