Definitely want to see more such videos. It would be better to split them into small pieces (10-20 minutes), one for each particular topic (Spark execution model, tuning Spark, etc.). But this format is also pretty good.
This is great. Thanks! It is annoying to watch unedited live presentations on YouTube, especially when you cannot hear what someone from the audience is asking, when the audience is asking tangential questions, or when the presenter is talking through housekeeping stuff. Please keep doing these screencasts, which take less time to go through and are distraction free.
Thank you very much, Patrick! This is awesome! Very insightful; many of the things you've shown we are using already, so I'm glad :) I think these kinds of screencasts are really important! The length should be around 30-45 minutes, like this one, to keep it focused. A really useful follow-up would be "Advanced Spark Execution Configuration": how to launch tasks on a Standalone / YARN cluster with the right load on the workers? What's a "core"? What's a "worker"? I can elaborate on more subjects like these if you would like, and I'm going to post these kinds of screencasts myself as well. Keep up the great work you guys are doing at Databricks.
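(Since the comment above asks what a "core" and a "worker" are: in Spark's terminology, a worker is a daemon on a standalone-cluster machine that hosts executors, while an executor "core" is a task slot, i.e. how many tasks one executor runs concurrently. As a rough sketch only, with illustrative values rather than recommendations, the relevant knobs appear in a `spark-submit` invocation like this:)

```shell
# Illustrative sketch, not a tuning recommendation. With these values,
# up to 10 executors * 4 cores = 40 tasks can run concurrently.
spark-submit \
  --master yarn \
  --num-executors 10 \
  --executor-cores 4 \
  --executor-memory 8g \
  my_app.jar
```

(On a standalone cluster you would use `--master spark://<host>:7077` and cap per-application cores with `--total-executor-cores` instead of `--num-executors`; `my_app.jar` is a placeholder for your application.)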
That was a wonderful explanation of Spark internals. The format is really good, and far better than the video formats generally available. Thanks for putting in the extra effort.
Excellent presentation, Patrick. The audio was super clear. Something similar on DataFrames would help; the DataFrames meetup presentation has quite bad audio. You mentioned at the start that fixing something starts with knowing it in depth. Completely agree. But I believe a lot of details of Spark's internal components and how they work are missing. You need to check the code to know it in depth, and if you are not a Scala developer that's almost impossible, since the codebase is in Scala. It would be great if you could make more such videos covering all the building blocks of Spark, like the block manager and how it works (e.g., how remote reads and shuffle reads/writes happen). A lot of the videos available are for beginners, but once you have worked on Spark for a while and know the basics and common ways of tweaking it, there is little help available to go to the next level. As the Spark community matures, you will find many people stuck at the intermediate level.
Just FYI, this entire content is present in the Learning Spark book, in the "Tuning and Debugging Apache Spark" chapter; I have gone through the entire book. But anyway, nicely explained. Thanks.
Thanks for posting this; it's really helpful. The format is great and the content is very well presented (the same goes for Chapter 8 of the "Learning Spark" book, which I just got today).