Thank you for the wonderful video. I feel the 4th point is contradictory: repartition and coalesce involve a data shuffle, which hurts performance, yet you suggest using them. Could you please explain how using them would improve performance?
Hi Sathish, these are recommended when you see data skew. When a Spark job takes hours to complete because a few partitions hold most of the data, redistributing the data more evenly across partitions can reduce the job runtime by more than the shuffle costs. That is how it helps improve performance.
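To illustrate the point about skew, here is a minimal sketch in plain Python (not Spark itself). It assumes that a stage finishes only when its slowest task finishes, so the largest partition sets the runtime, and that rows are redistributed round-robin, which is roughly what `repartition(n)` does when no column is given. The partition sizes and the helper names (`skewed_partitions`, `round_robin_repartition`, `slowest_task_rows`) are hypothetical and exist only for this example.

```python
from typing import List


def skewed_partitions() -> List[List[int]]:
    # Hypothetical skewed layout: one partition holds 9,000 of 10,000 rows.
    return [
        list(range(0, 9_000)),
        list(range(9_000, 9_500)),
        list(range(9_500, 10_000)),
        [],
    ]


def round_robin_repartition(
    partitions: List[List[int]], num_partitions: int
) -> List[List[int]]:
    # Deal every row out in turn to the target partitions.
    # This stands in for the shuffle: every row may move.
    out: List[List[int]] = [[] for _ in range(num_partitions)]
    i = 0
    for part in partitions:
        for row in part:
            out[i % num_partitions].append(row)
            i += 1
    return out


def slowest_task_rows(partitions: List[List[int]]) -> int:
    # Assumption: task time is proportional to rows in its partition,
    # so the largest partition approximates the stage runtime.
    return max(len(p) for p in partitions)


before = skewed_partitions()
after = round_robin_repartition(before, 4)

print(slowest_task_rows(before))  # 9000
print(slowest_task_rows(after))   # 2500
```

In this toy model the slowest task drops from 9,000 rows to 2,500, which is why paying for one shuffle can still shorten a badly skewed job. Real jobs also pay network and serialization costs for the shuffle, so the gain depends on how severe the skew is.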
Hi Saravana, could you please explain the DAG in more detail, so we can understand the optimization techniques? How can I find out which part of a query takes the most time, and how do I tune that query?
Hi Sravana, your content is amazing. It has helped me understand lots of Scala and Spark-related concepts. Thanks for this. Also, do you maintain any repo or drive where you add these PPTs and the practice code?