Thank you, bro! Your videos are very informative and helpful. Could you please make a video explaining how to set up Spark on a local machine? That would be very helpful.
Thanks @jdisunil for the kind words. There's already an in-depth video on AQE. You can refer to it here: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-bRjVa7MgsBM.html
Hey @subaruhassufferredenough7892, thank you for the kind words, really appreciate it :) On Spark SQL: the DAGs/execution plans for both Spark SQL and the non-SQL (Python DataFrame) API are the same, as both are compiled and optimized by the same underlying engine, the Catalyst optimizer.
At 16:49, in the AQE plan for the larger dataset, the way I understood it is that 1 skewed partition was split into 12, so we finally had 24 + 12 = 36 partitions. We see the same in Job Id 9 at 13:40, which had 36 tasks. But I heard you say that 36 partitions were reduced to 24. Can you please help clear up the confusion? Thank you.
I think in that AQE step, AQEShuffleRead reads 200 partitions (as per the previous node) from the customers dataset, which are then coalesced to 24; then something happened that made them 36, which is why the right-side node shows "number of partitions: 36". On the left side, for the transactions dataset, "number of partitions: 36" appears as the last value, whereas on the right side, for the customers dataset, it appears as the first value. But I'm not sure what that "something" is???
Hi Afaque Ahmad, at 13:37 you were saying that there is a separate job for each shuffle operation: one job for the transactions dataset's shuffle and one for the customers dataset's. I'm a bit confused about why they need a separate job. As per my understanding, when Spark encounters a shuffle operation, it just creates a new stage within that job, right? When I execute the same code snippet, it creates 5 jobs in total: two for metadata (expected), two for the shuffle operations (not expected), and a final one for the join operation. Many thanks.
Hi Afaque Ahmad, at 7:24 you were saying that a batch is a group of rows and is not the same as a partition. Should we assume it is something like a group of rows read from one or more partitions available on one or more executors (not necessarily all executors) to match that df.show() count?
Hello bro, I have a doubt. At 23:30 in the video, it was mentioned that AQEShuffleRead coalesced the partitions into 1; in that case, will the other worker nodes sit idle? The video also mentions that even after the shuffle, all A's will be in one partition and all B's in another. Can you please explain what you actually mean by "number of coalesced partitions = 1"?
By default, there are 200 shuffle partitions, which is why you see that number in the 'Exchange' step. The reduction (optimization) to fewer partitions takes place in the 'AQEShuffleRead' step below it.
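For reference, these are the Spark configuration properties behind that behaviour; a sketch of a `spark-defaults.conf` fragment, with the values shown being Spark 3.x defaults:

```
spark.sql.shuffle.partitions                     200    # partition count you see in the 'Exchange' step
spark.sql.adaptive.enabled                       true   # AQE, enabled by default since Spark 3.2
spark.sql.adaptive.coalescePartitions.enabled    true   # lets AQEShuffleRead merge small shuffle partitions
```

With the last two enabled, AQE looks at the actual shuffle output sizes at runtime and coalesces the 200 partitions down to fewer, which is the reduction shown in the AQEShuffleRead node.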